{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "53f45502",
   "metadata": {},
   "source": [
    "# Data Processing with NumPy, Pandas, Matplotlib\n",
    "\n",
    "---\n",
    "In this chapter, we review how to use some common libraries for data processing and visualization."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e749bea3",
   "metadata": {},
   "source": [
    "## NumPy"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1f27b023",
   "metadata": {},
   "source": [
    "There are many ways to create NumPy arrays. The first two cells below build one from a nested list and from random normal samples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "83820093",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1 2 3]\n",
      " [4 5 6]]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "array1 = np.array([[1,2,3],[4,5,6]])\n",
    "print(array1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "0951c3a1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-2.26918915 -0.67740413]\n",
      " [-0.21933769  0.36306447]\n",
      " [ 0.53810025 -0.79881813]\n",
      " [-0.11290804 -1.3596506 ]\n",
      " [ 1.2717553  -0.88395839]\n",
      " [-1.05240723 -2.50096135]\n",
      " [-1.58824402  0.41612032]\n",
      " [ 1.29945014 -0.49539218]\n",
      " [ 0.17166377  0.72052355]\n",
      " [-1.09496077 -0.42980521]]\n"
     ]
    }
   ],
   "source": [
    "array2 = np.random.randn(10, 2)\n",
    "print(array2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "d9014e44",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 0  1  2  3  4]\n",
      " [ 5  6  7  8  9]\n",
      " [10 11 12 13 14]]\n"
     ]
    }
   ],
   "source": [
    "# Reshape a 1-D range into a 3x5 array\n",
    "array3 = np.arange(15).reshape(3, 5)\n",
    "print(array3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "e200b76f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
      " [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
      " [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]\n",
      " [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]\n",
      " [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]\n",
      " [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]\n",
      " [0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]\n",
      " [0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]\n",
      " [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]\n",
      " [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]\n"
     ]
    }
   ],
   "source": [
    "# Identity matrix\n",
    "identity_array = np.identity(10,dtype=np.float32)\n",
    "print(identity_array)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "dc2c838e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[0. 0. 0.]\n",
      " [0. 0. 0.]\n",
      " [0. 0. 0.]\n",
      " [0. 0. 0.]\n",
      " [0. 0. 0.]\n",
      " [0. 0. 0.]\n",
      " [0. 0. 0.]\n",
      " [0. 0. 0.]\n",
      " [0. 0. 0.]\n",
      " [0. 0. 0.]]\n"
     ]
    }
   ],
   "source": [
    "# All-zeros matrix\n",
    "zero_array = np.zeros((10, 3), dtype=np.float32)\n",
    "print(zero_array)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "cc94c962",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1. 1. 1.]\n",
      " [1. 1. 1.]\n",
      " [1. 1. 1.]\n",
      " [1. 1. 1.]\n",
      " [1. 1. 1.]\n",
      " [1. 1. 1.]\n",
      " [1. 1. 1.]\n",
      " [1. 1. 1.]\n",
      " [1. 1. 1.]\n",
      " [1. 1. 1.]]\n"
     ]
    }
   ],
   "source": [
    "# All-ones matrix\n",
    "one_array = np.ones((10, 3))\n",
    "print(one_array)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "28505f19",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]\n",
      " [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]\n",
      " [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]\n",
      " [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]\n",
      " [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]\n",
      " [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]\n",
      " [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]\n",
      " [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]\n",
      " [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]\n",
      " [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]\n"
     ]
    }
   ],
   "source": [
    "# Return an array of ones with the same shape and dtype as the given array.\n",
    "similar = np.ones_like(identity_array)\n",
    "print(similar)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8ad439cf",
   "metadata": {},
   "source": [
    "More array creation routines can be found [here](https://numpy.org/devdocs/reference/routines.array-creation.html)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d700725b",
   "metadata": {},
   "source": [
    "## Array Indexing and Reshaping\n",
    "\n",
    "Basic array indexing follows Python's slicing syntax (`start:stop:step`), applied independently along each axis.  \n",
    "The cells below walk through common slicing and reshaping operations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "f3978baf",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.70014975  0.95742838  1.28082844 -1.29765103 -1.36252022  0.56820401\n",
      " -0.50215937  0.55069486 -0.27959702 -0.06759568 -0.88298613  0.6576133\n",
      "  0.91346398  1.00600416 -0.18435108 -0.47628868 -0.77833753  0.65168523\n",
      " -1.13581252 -0.82610414 -0.3835191   2.05786707  0.57205623 -0.73706807\n",
      "  0.80414367 -0.2963445   0.90118868 -0.9358811   0.02182282  0.80676026\n",
      " -0.21551056 -0.07272446 -0.23836562 -0.31201107  0.00810649 -0.50549687\n",
      " -1.21044217  0.37730951  0.4135047   0.51923835  0.15351721  0.18197956\n",
      "  1.3162728  -0.32205376  0.42333362  0.58823948  1.31938608 -0.45664842\n",
      " -0.50589146 -0.42410091 -0.34573745  1.20677373 -1.45421825 -1.10572623\n",
      " -1.01452431 -0.27896849 -0.20367896  1.94992014 -0.29560535  0.55049529\n",
      " -2.1697812  -0.89344172 -1.34867772  2.07288888  1.37057159 -1.61097298\n",
      "  0.19337161 -0.49333178  0.25782928 -0.92268779  0.14101419  0.85552541\n",
      " -0.17993712  1.02870319 -0.93398416  2.09107556 -0.94274176  1.79369247\n",
      "  0.26425576 -0.24393967  1.38054005 -2.1967742   1.21174078  0.46178827\n",
      "  0.47779756 -0.66210251  0.68566411  1.56555774  0.33603159  1.9870448\n",
      " -1.2605988  -0.04680586  1.54962899 -1.39076054  0.82424308 -0.91204096\n",
      " -0.92872717  1.59438525  0.21687828  2.01539796]\n"
     ]
    }
   ],
   "source": [
    "array = np.random.randn(100, 10,15,10)\n",
    "print(array[:, 0, 0, 0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "fe049bd5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-0.70014975 -0.4952691   0.29410984  1.20104276 -0.42323716  1.68852364\n",
      "   1.48368738 -0.37932514  0.70134439 -0.30757804]\n",
      " [ 0.95742838  1.53876961 -0.93129724  0.46828917 -0.58184922 -0.09160302\n",
      "  -0.49441727  0.34674347 -0.61517552  1.82146846]\n",
      " [ 1.28082844 -0.38306574  0.59162851 -0.3806868  -0.47015558 -1.35481649\n",
      "  -0.85881533 -0.55402414  0.15124644 -0.70314851]\n",
      " [-1.29765103 -0.86454226 -1.05900738  0.74792977  0.1196733   0.46447492\n",
      "  -0.2975172   0.90028668 -1.24871355 -1.59482019]\n",
      " [-1.36252022 -1.34124716  0.55143774 -0.49040377  1.11306758  1.5535444\n",
      "   0.4307237  -1.50364361  0.46053101 -0.48137355]\n",
      " [ 0.56820401 -2.39618254  0.41134235  0.83462013  0.19091058 -1.24682507\n",
      "   1.70903565 -0.2550406   0.98240697 -0.08917357]\n",
      " [-0.50215937  0.69443706  0.06585498  0.06853408  0.82716076  0.57767398\n",
      "   1.49991855  0.26874776  0.98147896  0.24818144]\n",
      " [ 0.55069486 -0.71744428 -0.89166047 -0.60377608 -0.93462573 -0.78167956\n",
      "  -0.9358967   0.45752011 -1.1024511  -0.00529599]\n",
      " [-0.27959702 -0.38902454 -2.02629247 -0.08215999 -0.66530428 -0.12346358\n",
      "   0.67316058 -1.12721052 -1.59148087 -1.41550798]\n",
      " [-0.06759568  0.23744835 -0.90497029  0.39948012 -1.36009059  0.55503785\n",
      "   1.37331515  0.25742565  1.1507258  -1.21422737]\n",
      " [-0.88298613 -0.34272133  0.42675781 -0.47428805  2.88174656 -0.69192271\n",
      "   2.37582088  0.12958948 -0.17754093  0.28533938]\n",
      " [ 0.6576133   0.29805388  1.68443267 -1.92591267 -0.43232652 -0.64630716\n",
      "  -0.61415726  0.76899067 -0.03063883 -0.0408397 ]\n",
      " [ 0.91346398  0.51996898  0.32281025 -0.47337381  0.90009547  1.61209546\n",
      "   0.57335493 -0.48032998 -0.0805054  -1.58336099]\n",
      " [ 1.00600416 -2.00625136 -0.26307056  0.44245801  1.52300983  0.4687144\n",
      "   0.87323527  0.10670889  0.70869779 -1.43988287]\n",
      " [-0.18435108  2.86485706  0.00567256  0.5178202  -0.7745044  -0.1994092\n",
      "   0.3969677   1.51631822 -1.32927288  1.42222956]\n",
      " [-0.47628868  0.27596942 -0.89362046  1.73169925 -0.38153831 -0.14982756\n",
      "  -0.78872285  1.80819781  0.60019399  1.3339656 ]\n",
      " [-0.77833753  0.74137409 -0.12598338  0.19122806  0.18068031  0.21817256\n",
      "  -0.54856336  0.40727235 -2.61186933  1.24514521]\n",
      " [ 0.65168523 -0.13540964 -0.57515778  0.42004655 -1.51907822  2.75538218\n",
      "  -0.6013817  -0.54004399 -0.17800596  0.88796132]\n",
      " [-1.13581252  1.75299246  0.10136401  0.55194885  1.12893936 -0.37200425\n",
      "   0.03993212  0.65252189  0.16090419  0.1265519 ]\n",
      " [-0.82610414  0.12167624 -1.87809253  0.51071367 -0.89019014  1.55216128\n",
      "   0.27417455 -1.22340827 -1.35282594 -0.03749206]\n",
      " [-0.3835191   0.27990609 -0.64090664 -0.56722392 -0.18525656 -0.14735084\n",
      "  -0.49392766  3.2139921  -3.37076948  0.99516901]\n",
      " [ 2.05786707  2.16772414 -0.25006824 -0.49558968 -0.08742932 -1.85691071\n",
      "  -1.2155424   2.15722902  0.9150488  -1.37040269]\n",
      " [ 0.57205623  1.05744703  0.66557092  1.78680547  0.07589378  0.62077974\n",
      "   0.15064729  0.78319604 -0.38147185  0.37082126]\n",
      " [-0.73706807 -0.50908278 -0.61256987 -0.95799226 -1.40936966  1.14791975\n",
      "  -0.61250426 -0.54607088 -0.25735413 -0.92827864]\n",
      " [ 0.80414367 -1.98425993  1.04729719 -3.27729054 -0.13199755  0.88580423\n",
      "  -0.57552598 -1.74392604 -0.73297829  0.25265957]\n",
      " [-0.2963445   1.68940583  0.55566367 -2.25539099  0.68216324 -0.18967362\n",
      "   1.67442694  0.80540505 -0.12663913 -0.34753059]\n",
      " [ 0.90118868 -0.39168593  0.28486839  0.23296736 -1.19953869  0.44092557\n",
      "  -0.5558762   1.73635188  0.3130017   1.36381573]\n",
      " [-0.9358811   0.07553185 -0.9029633  -1.02310637 -0.79383486 -0.55949896\n",
      "  -1.16138471 -0.39886728  0.12359004  0.01098505]\n",
      " [ 0.02182282 -0.6985061   0.56946292  0.10479283  0.75419924 -0.42907349\n",
      "   0.43756642 -0.20115865  0.53763965  0.09219989]\n",
      " [ 0.80676026  1.05896972  0.77397496  0.38576567 -0.73901174 -0.19054655\n",
      "  -0.96913791 -0.41194478 -0.00564763  1.39239938]\n",
      " [-0.21551056 -0.45965938 -1.89737319 -0.56008217 -0.38001048  1.4970101\n",
      "  -1.09632809  0.46361928 -0.75171828 -0.67722078]\n",
      " [-0.07272446  1.77892064  0.57874904 -0.44129269  0.45303865  1.1237239\n",
      "   1.13204992 -0.93879952 -0.33159474 -1.41282524]\n",
      " [-0.23836562 -1.22498338  1.60357325  0.15645074  0.59569215 -2.68087497\n",
      "  -1.04673147 -0.08935697  0.95833065  0.64297548]\n",
      " [-0.31201107  0.67078454  0.88536301 -1.47522448 -0.36328882 -1.26448393\n",
      "   0.64700527 -1.18881596  0.64793433 -1.39183813]\n",
      " [ 0.00810649  0.0597636   0.99356614 -1.49447536 -1.01155257 -0.64072938\n",
      "  -2.11487273  0.66477815  1.6773563   0.87572017]\n",
      " [-0.50549687  0.46023374  0.31376673  0.25566195  1.60365098  0.39381558\n",
      "  -0.08845004 -0.49986157  0.59299473  0.51159565]\n",
      " [-1.21044217  0.18488956  0.41345137  1.24350544  1.15645598 -1.23189644\n",
      "  -0.91268306 -0.7742684   1.02657    -2.09008714]\n",
      " [ 0.37730951  0.54358879  0.23163213 -0.22937755  0.24975991  0.8100931\n",
      "  -1.83415479 -1.33883002  1.38050586 -0.43602602]\n",
      " [ 0.4135047  -3.19405621  0.18796721 -0.61606169  0.55852603 -0.76511599\n",
      "   0.51378254 -1.660782    0.49990145 -0.4738475 ]\n",
      " [ 0.51923835 -0.19813     1.18062267  0.06866698 -0.36240557 -0.09360401\n",
      "   0.14778437  0.03864805 -0.43981106  1.89588612]\n",
      " [ 0.15351721 -0.11866635 -0.30766497 -0.87228891  0.56325253 -1.97157263\n",
      "   0.66735404  0.63313371 -0.73391688 -0.88640874]\n",
      " [ 0.18197956  0.02328974 -0.1550263  -1.41121545  0.12075482 -1.19144875\n",
      "   0.24620269 -0.75325178 -0.36333288 -0.56147462]\n",
      " [ 1.3162728  -0.33465476  1.76473656  0.1771462  -0.88179809  0.71005787\n",
      "  -0.62883887 -0.91478525  1.07883949 -0.07867293]\n",
      " [-0.32205376 -0.52558231  1.54976781  0.20889623  0.70314367  1.60147645\n",
      "   0.90122629  0.35756791 -0.38260578 -1.16330092]\n",
      " [ 0.42333362 -1.36401473  0.40464745 -0.33317535 -0.54777001 -0.21678756\n",
      "  -1.08701971 -0.37415351 -0.55006986 -0.03028376]\n",
      " [ 0.58823948 -0.49716213 -2.26936958  0.2807482  -0.67795191 -0.90616955\n",
      "  -1.08843253  0.78320948  1.11422557 -2.82462978]\n",
      " [ 1.31938608  0.5542335   0.67688763  0.23804997 -0.05367111 -0.56377209\n",
      "   0.18755422 -1.58569607  0.58305838  1.06915786]\n",
      " [-0.45664842 -0.08718662  0.09209277  0.45282271  1.50349931 -0.30542974\n",
      "  -1.3728109   1.63781217 -1.01713269 -1.38204913]\n",
      " [-0.50589146 -1.31203679  0.79356673 -0.46469107  0.32170188  1.13695832\n",
      "   0.93955027 -1.39367353  1.0282871   0.18378863]\n",
      " [-0.42410091 -1.27781818  0.43273804  0.66910997 -0.47014571 -0.01155601\n",
      "  -0.41115642 -1.74331785  0.82681023 -1.77391986]\n",
      " [-0.34573745 -1.21185569 -1.24283374 -0.90759782  1.45235056 -0.56351295\n",
      "  -1.22900088 -0.74496938 -0.12218276  0.59406789]\n",
      " [ 1.20677373  0.05476731  0.20594457  0.31590328 -0.57188033  0.19791405\n",
      "   0.60501768  0.3294223  -0.05383447  1.43359377]\n",
      " [-1.45421825 -1.12823083 -0.77965682 -0.06683676 -0.31038007 -0.09688226\n",
      "   0.94241434 -1.71360728 -0.83651862  1.27175497]\n",
      " [-1.10572623  0.1428526  -0.35576768  1.0408522  -1.30729341  1.34539508\n",
      "  -0.58976687 -0.65559516  0.23385219  1.61228775]\n",
      " [-1.01452431  0.4826287   0.35315426 -0.23453441 -1.21610238 -0.46406329\n",
      "   3.44007484  1.04524664 -0.18624335 -0.91973566]\n",
      " [-0.27896849  1.08181781  0.61025603 -1.2822727   0.71414991 -1.88239717\n",
      "   0.622389   -1.00764577 -1.52649693 -1.22416601]\n",
      " [-0.20367896 -0.33904316  0.24739776 -0.58373121  1.22784493 -0.3713069\n",
      "  -0.46796753  0.52236718 -1.33332671  1.37102405]\n",
      " [ 1.94992014 -1.29622749 -1.00227244  1.54988283  1.27550067 -0.42185998\n",
      "   0.78547755  1.03570844  0.19395279  0.39280611]\n",
      " [-0.29560535 -1.61985824  0.60381636  0.21038881  2.22834722 -0.7680946\n",
      "   0.35462636 -0.11618641  2.04611755  0.87784416]\n",
      " [ 0.55049529 -0.42581136  1.02184527 -0.3708925   0.17552212  0.76148543\n",
      "  -0.78329519 -0.42459474 -1.03979751 -0.13403031]\n",
      " [-2.1697812   1.24188149  0.9153638   0.39835622  1.72517536  1.96541673\n",
      "   0.3922005  -0.62074965 -0.34799148 -0.9243054 ]\n",
      " [-0.89344172 -1.09935864  0.29988995  0.41845366 -0.23453697 -2.03164149\n",
      "  -1.45360778 -1.27828056  1.4236342   0.93971577]\n",
      " [-1.34867772 -0.02373789 -0.5106762  -0.23525682 -0.67662349 -0.59293792\n",
      "   0.30153914 -0.47531727 -0.83498145  0.32871324]\n",
      " [ 2.07288888  1.08644546  0.56342769 -0.18860793 -0.12631159 -1.01627621\n",
      "  -0.57785242 -0.88599485  1.21895697 -0.17993138]\n",
      " [ 1.37057159 -0.79691538 -0.01985902  0.86976055  0.31539505  0.7689727\n",
      "  -1.86466866 -0.64316531  1.59015725 -0.8872665 ]\n",
      " [-1.61097298 -1.86674429  1.51691165 -1.78632435 -0.57840027  2.68727181\n",
      "  -0.64496365 -0.20577597  0.75214173  0.87198727]\n",
      " [ 0.19337161  0.98107794  0.31933513 -1.34997811 -0.21589918 -0.39056055\n",
      "   0.8487978   0.2103179  -0.78469156 -0.30451148]\n",
      " [-0.49333178  0.8489178   0.17876883  2.0148987  -0.73822137  0.47174224\n",
      "  -1.34257465  0.15461474  0.27481849 -0.38877965]\n",
      " [ 0.25782928 -0.58772373 -1.04079356  0.21588632 -1.33314018 -0.08143482\n",
      "   1.80343421 -1.77070581  0.69997224 -0.49131899]\n",
      " [-0.92268779  0.24385856  0.67360085  1.88438943  1.03526055 -0.81359144\n",
      "   0.19230882 -1.58401841 -1.16002468  0.93423211]\n",
      " [ 0.14101419 -0.23090664 -0.57221845 -0.37641373 -1.1691792   1.63365404\n",
      "  -0.53927791  0.33354287  0.17354808  0.01092095]\n",
      " [ 0.85552541  0.52426478 -1.08265967  0.33593078 -1.5844397   0.30940066\n",
      "   1.17184228 -1.25427688 -0.07752534 -0.90628138]\n",
      " [-0.17993712  0.42241742 -0.12395227  0.18102797  0.00764985  1.37228748\n",
      "  -0.59995828 -0.63577474  0.21161429 -1.29195205]\n",
      " [ 1.02870319  1.16047704  0.38447925 -1.77952336 -0.14075338 -0.12820995\n",
      "  -0.20815333 -1.21691492 -0.21658602  0.20118706]\n",
      " [-0.93398416  1.94365482  1.1464755   0.40117145 -0.05669613 -1.01495\n",
      "  -0.21999103  0.41420561  2.07100518 -0.67533062]\n",
      " [ 2.09107556  1.53408984  1.25835135 -0.49070542  0.35890386 -1.17493694\n",
      "   0.54972908 -1.22817251 -0.10607217 -1.09552464]\n",
      " [-0.94274176 -2.01282421  0.73873478  0.60131435  1.1877213  -0.94434775\n",
      "   1.67920027 -0.80381701 -1.61742464  0.10818906]\n",
      " [ 1.79369247  1.62688003 -1.74231326 -2.25665163 -0.38227781 -0.16257778\n",
      "   0.8060961   0.81308595 -0.0363248   0.11660015]\n",
      " [ 0.26425576 -0.89827897 -1.4450494   0.66275391  0.98134403  0.67821961\n",
      "  -0.9351219  -0.05100886 -0.58249859  0.57851761]\n",
      " [-0.24393967 -0.28726607  1.3364499   1.35762493 -0.90733317 -0.87021017\n",
      "  -1.12734937  1.03076381 -1.01835702 -1.30912323]\n",
      " [ 1.38054005 -0.22315013  0.95629994  0.3374439   1.55916925 -0.02735626\n",
      "  -1.12874028 -1.61083445 -0.7486289  -0.70322005]\n",
      " [-2.1967742   0.15514308  0.28504209 -0.93981393 -1.25572707 -1.02644697\n",
      "   0.35873467 -0.74842845 -1.26052871 -0.91574624]\n",
      " [ 1.21174078 -0.16578769 -0.24734684 -0.25199211 -0.00623968  0.0707563\n",
      "  -2.09903977  1.41485024  0.51763841 -0.30532224]\n",
      " [ 0.46178827 -1.77647731 -0.21695614 -0.15197668 -0.94775993  2.10433905\n",
      "   0.39709466 -0.32020111 -0.51973825 -1.22192124]\n",
      " [ 0.47779756  1.7642487   1.64742824  0.56782247  0.35237662 -0.31522107\n",
      "   0.39785128  0.88008357 -0.43845865  1.18041534]\n",
      " [-0.66210251 -0.881653   -0.70352717  0.26383881  0.80723498 -0.32513029\n",
      "  -0.61025604 -1.32519981 -1.07266694  1.0399818 ]\n",
      " [ 0.68566411 -0.49547786  0.13844485  0.54083662  1.16635177  0.97996399\n",
      "  -1.48804215  0.34207249  0.93833981 -1.22543782]\n",
      " [ 1.56555774  0.79027085 -0.96624404  0.34505121  0.30125439 -0.25025285\n",
      "  -1.16749103 -1.55949768 -0.70960649  0.52890496]\n",
      " [ 0.33603159 -1.90568272  1.35840318  0.77817832 -0.0264802   1.37097553\n",
      "   0.37560681 -0.82193861  1.12640669 -0.75322504]\n",
      " [ 1.9870448   1.05319936 -0.67159625  0.20172183  1.41195399  2.80215805\n",
      "   1.68430028 -0.81001812  1.25523093  0.87565567]\n",
      " [-1.2605988  -0.11852197 -0.71909121 -0.53309588  0.26543225 -1.07999359\n",
      "   0.6779507   1.7279905   0.71197513  0.19228456]\n",
      " [-0.04680586  0.75986054  0.29652783  0.06368187 -0.79017409 -0.80427619\n",
      "   0.45748918 -0.87724886 -1.48782354 -1.77624741]\n",
      " [ 1.54962899 -0.02732902  0.31934205  0.14626293  1.01646644  0.85578295\n",
      "   0.11495456  0.11052964  1.17346548  2.04626301]\n",
      " [-1.39076054 -1.38272153  0.44792249 -1.60853627 -0.31312208  1.1535834\n",
      "  -0.12264751  0.31225877  0.51870374 -1.38166861]\n",
      " [ 0.82424308  1.16270458  1.55164608 -2.31197105 -0.01449636  1.17262545\n",
      "   0.12144954  2.01789975  0.87965627  0.83605122]\n",
      " [-0.91204096  0.99998706 -1.03339501 -1.36462722  0.93242113  0.10718831\n",
      "  -0.11919095  0.41016432 -0.15014296 -0.08065396]\n",
      " [-0.92872717 -0.20794408 -1.53066089  0.75843527 -0.10966734  0.63947109\n",
      "  -0.27603973 -2.16960839 -0.07939813  1.10809456]\n",
      " [ 1.59438525  0.02429425 -1.57794478  0.61725614  0.30686051  0.06563727\n",
      "   0.09687418 -0.3093951   0.37030335  0.26382806]\n",
      " [ 0.21687828  0.551932    1.34923687  0.25064239 -2.25294847  1.91393661\n",
      "  -0.31140637 -2.06719945 -0.64568967  0.57514024]\n",
      " [ 2.01539796 -0.62540742  2.22853129  1.29014494 -0.17692815 -1.56703975\n",
      "   0.92971677 -0.85406408 -1.47744376 -0.02937027]]\n"
     ]
    }
   ],
   "source": [
    "print(array[:, :, 0, 0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "27f8e77c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-0.70014975 -0.4952691   0.29410984  1.20104276 -0.42323716  1.68852364\n",
      "   1.48368738 -0.37932514  0.70134439 -0.30757804]\n",
      " [ 0.95742838  1.53876961 -0.93129724  0.46828917 -0.58184922 -0.09160302\n",
      "  -0.49441727  0.34674347 -0.61517552  1.82146846]\n",
      " [ 1.28082844 -0.38306574  0.59162851 -0.3806868  -0.47015558 -1.35481649\n",
      "  -0.85881533 -0.55402414  0.15124644 -0.70314851]\n",
      " [-1.29765103 -0.86454226 -1.05900738  0.74792977  0.1196733   0.46447492\n",
      "  -0.2975172   0.90028668 -1.24871355 -1.59482019]\n",
      " [-1.36252022 -1.34124716  0.55143774 -0.49040377  1.11306758  1.5535444\n",
      "   0.4307237  -1.50364361  0.46053101 -0.48137355]\n",
      " [ 0.56820401 -2.39618254  0.41134235  0.83462013  0.19091058 -1.24682507\n",
      "   1.70903565 -0.2550406   0.98240697 -0.08917357]\n",
      " [-0.50215937  0.69443706  0.06585498  0.06853408  0.82716076  0.57767398\n",
      "   1.49991855  0.26874776  0.98147896  0.24818144]\n",
      " [ 0.55069486 -0.71744428 -0.89166047 -0.60377608 -0.93462573 -0.78167956\n",
      "  -0.9358967   0.45752011 -1.1024511  -0.00529599]\n",
      " [-0.27959702 -0.38902454 -2.02629247 -0.08215999 -0.66530428 -0.12346358\n",
      "   0.67316058 -1.12721052 -1.59148087 -1.41550798]\n",
      " [-0.06759568  0.23744835 -0.90497029  0.39948012 -1.36009059  0.55503785\n",
      "   1.37331515  0.25742565  1.1507258  -1.21422737]]\n"
     ]
    }
   ],
   "source": [
    "print(array[0:10, :, 0, 0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "7a5c1cd1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.70014975 -1.29765103 -0.50215937 -0.06759568]\n"
     ]
    }
   ],
   "source": [
    "print(array[0:10:3, 0, 0, 0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "0383fd29",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.70014975  0.95742838  1.28082844 -1.29765103 -1.36252022  0.56820401\n",
      " -0.50215937  0.55069486 -0.27959702 -0.06759568 -0.88298613  0.6576133\n",
      "  0.91346398  1.00600416 -0.18435108 -0.47628868 -0.77833753  0.65168523\n",
      " -1.13581252 -0.82610414 -0.3835191   2.05786707  0.57205623 -0.73706807\n",
      "  0.80414367 -0.2963445   0.90118868 -0.9358811   0.02182282  0.80676026\n",
      " -0.21551056 -0.07272446 -0.23836562 -0.31201107  0.00810649 -0.50549687\n",
      " -1.21044217  0.37730951  0.4135047   0.51923835  0.15351721  0.18197956\n",
      "  1.3162728  -0.32205376  0.42333362  0.58823948  1.31938608 -0.45664842\n",
      " -0.50589146 -0.42410091 -0.34573745  1.20677373 -1.45421825 -1.10572623\n",
      " -1.01452431 -0.27896849 -0.20367896  1.94992014 -0.29560535  0.55049529\n",
      " -2.1697812  -0.89344172 -1.34867772  2.07288888  1.37057159 -1.61097298\n",
      "  0.19337161 -0.49333178  0.25782928 -0.92268779  0.14101419  0.85552541\n",
      " -0.17993712  1.02870319 -0.93398416  2.09107556 -0.94274176  1.79369247\n",
      "  0.26425576 -0.24393967  1.38054005 -2.1967742   1.21174078  0.46178827\n",
      "  0.47779756 -0.66210251  0.68566411  1.56555774  0.33603159  1.9870448\n",
      " -1.2605988  -0.04680586  1.54962899 -1.39076054  0.82424308 -0.91204096\n",
      " -0.92872717  1.59438525]\n"
     ]
    }
   ],
   "source": [
    "print(array[:-2, 0, 0, 0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3511d23c",
   "metadata": {},
   "source": [
    "**Pay attention** to the following code! Slicing does not create a copy; it returns a view of the same data.\n",
    "<br/>Writing to the slice below therefore also changes the original matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "6efd8b1c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 0.80013324  0.56193112 -1.18700981  0.13639308 -1.05447891  1.48171733\n",
      "   1.49270938 -2.33944076 -0.16382534  0.85158009]\n",
      " [ 1.35848243  0.90676562  0.30696525 -0.68292076 -0.03698499  1.45614388\n",
      "   0.03857497 -0.63290897 -0.70411049  0.30065094]\n",
      " [ 1.65184023  0.63113275  0.0316235  -0.86919686 -0.02017094 -0.58627194\n",
      "  -0.94673218 -0.15120437  0.13029738  0.05075978]\n",
      " [-0.30381056 -0.07385902 -0.329443   -1.03933421  0.47225135 -0.7075521\n",
      "  -0.34561292 -1.04467692 -0.16076133  0.38403796]\n",
      " [-0.91787722 -0.4488372  -1.04905802  0.51773566 -1.30832445 -0.55520423\n",
      "   0.91223157  0.3844675  -0.02843521 -0.45059915]\n",
      " [-0.90209091 -0.00407952  0.65200542 -0.66548712 -0.75617436 -0.10415422\n",
      "  -0.74790929 -0.28583401 -0.36655664  0.34095067]\n",
      " [ 0.66019199  0.03072407  0.6919309   0.00553084 -0.52250756 -0.92026228\n",
      "   1.91073149  1.68979207  3.43358887  0.88743568]\n",
      " [ 0.12723791 -0.44221849  0.00807638  0.34365881  1.31380763  0.59126669\n",
      "  -1.31769367 -0.14732265  2.19464088 -0.23794102]\n",
      " [-0.27959001 -0.59612307 -0.6629633  -0.02134111 -0.23396332 -0.36479829\n",
      "   0.68831884  0.86714724 -1.2434636  -1.26922124]\n",
      " [-0.1975982   1.2997065  -1.06142707  0.21570192 -0.20777458  0.44268076\n",
      "   0.96489459 -0.69320258 -2.11865552 -0.72221438]]\n"
     ]
    }
   ],
   "source": [
    "x = np.random.randn(10,10)\n",
    "print(x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "f128b59f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 1.          0.56193112 -1.18700981  0.13639308 -1.05447891  1.48171733\n",
      "   1.49270938 -2.33944076 -0.16382534  0.85158009]\n",
      " [ 1.          0.90676562  0.30696525 -0.68292076 -0.03698499  1.45614388\n",
      "   0.03857497 -0.63290897 -0.70411049  0.30065094]\n",
      " [ 1.          0.63113275  0.0316235  -0.86919686 -0.02017094 -0.58627194\n",
      "  -0.94673218 -0.15120437  0.13029738  0.05075978]\n",
      " [ 1.         -0.07385902 -0.329443   -1.03933421  0.47225135 -0.7075521\n",
      "  -0.34561292 -1.04467692 -0.16076133  0.38403796]\n",
      " [ 1.         -0.4488372  -1.04905802  0.51773566 -1.30832445 -0.55520423\n",
      "   0.91223157  0.3844675  -0.02843521 -0.45059915]\n",
      " [ 1.         -0.00407952  0.65200542 -0.66548712 -0.75617436 -0.10415422\n",
      "  -0.74790929 -0.28583401 -0.36655664  0.34095067]\n",
      " [ 1.          0.03072407  0.6919309   0.00553084 -0.52250756 -0.92026228\n",
      "   1.91073149  1.68979207  3.43358887  0.88743568]\n",
      " [ 1.         -0.44221849  0.00807638  0.34365881  1.31380763  0.59126669\n",
      "  -1.31769367 -0.14732265  2.19464088 -0.23794102]\n",
      " [ 1.         -0.59612307 -0.6629633  -0.02134111 -0.23396332 -0.36479829\n",
      "   0.68831884  0.86714724 -1.2434636  -1.26922124]\n",
      " [ 1.          1.2997065  -1.06142707  0.21570192 -0.20777458  0.44268076\n",
      "   0.96489459 -0.69320258 -2.11865552 -0.72221438]]\n"
     ]
    }
   ],
   "source": [
    "y = x[:, 0]\n",
    "y[:]=1\n",
    "print(x) # x is changed as well"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "817bd7f0",
   "metadata": {},
   "source": [
    "To avoid unintentionally sharing data with the original array, call the `copy` method on the slice.\n",
    "\n",
    "Note that this will be a common pattern!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "0a314cce",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 1.          0.56193112 -1.18700981  0.13639308 -1.05447891  1.48171733\n",
      "   1.49270938 -2.33944076 -0.16382534  0.85158009]\n",
      " [ 1.          0.90676562  0.30696525 -0.68292076 -0.03698499  1.45614388\n",
      "   0.03857497 -0.63290897 -0.70411049  0.30065094]\n",
      " [ 1.          0.63113275  0.0316235  -0.86919686 -0.02017094 -0.58627194\n",
      "  -0.94673218 -0.15120437  0.13029738  0.05075978]\n",
      " [ 1.         -0.07385902 -0.329443   -1.03933421  0.47225135 -0.7075521\n",
      "  -0.34561292 -1.04467692 -0.16076133  0.38403796]\n",
      " [ 1.         -0.4488372  -1.04905802  0.51773566 -1.30832445 -0.55520423\n",
      "   0.91223157  0.3844675  -0.02843521 -0.45059915]\n",
      " [ 1.         -0.00407952  0.65200542 -0.66548712 -0.75617436 -0.10415422\n",
      "  -0.74790929 -0.28583401 -0.36655664  0.34095067]\n",
      " [ 1.          0.03072407  0.6919309   0.00553084 -0.52250756 -0.92026228\n",
      "   1.91073149  1.68979207  3.43358887  0.88743568]\n",
      " [ 1.         -0.44221849  0.00807638  0.34365881  1.31380763  0.59126669\n",
      "  -1.31769367 -0.14732265  2.19464088 -0.23794102]\n",
      " [ 1.         -0.59612307 -0.6629633  -0.02134111 -0.23396332 -0.36479829\n",
      "   0.68831884  0.86714724 -1.2434636  -1.26922124]\n",
      " [ 1.          1.2997065  -1.06142707  0.21570192 -0.20777458  0.44268076\n",
      "   0.96489459 -0.69320258 -2.11865552 -0.72221438]]\n"
     ]
    }
   ],
   "source": [
    "z = x[0, :].copy()\n",
    "z[:] = 10  # z is an independent copy, so x is unchanged\n",
    "print(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "67fba826",
   "metadata": {},
   "source": [
    "In addition, one can index an array using *boolean conditions*. We leave these tasks to pandas.  \n",
    "Here we demonstrate how to change the shape of tensors, which is a very common operation when building neural networks.\n",
    "<br/>Pay attention to how the shape changes at each step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "6da9ffd1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(10, 1, 5)\n"
     ]
    }
   ],
   "source": [
    "x = np.random.randn(10, 5)\n",
    "print(x[:, None, :].shape) # This will add a new dimension"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "1e2f5d86",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(10, 1, 5)\n"
     ]
    }
   ],
   "source": [
    "print(np.expand_dims(x, 1).shape) # Doing this has the same effect"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "5ee1536b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(10, 5)\n"
     ]
    }
   ],
   "source": [
    "# np.expand_dims and np.squeeze are inverse operations of each other.\n",
    "x = x[:, None, :]\n",
    "print(x.squeeze().shape) # Doing this removes the 'extra' dimension"
   ]
  },
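  {
   "cell_type": "markdown",
   "id": "a7e10001",
   "metadata": {},
   "source": [
    "One caveat worth keeping in mind: called without arguments, `squeeze` removes *every* axis of length 1, not only the one added by `expand_dims`. The short sketch below (the array name `w` is chosen purely for illustration) shows that passing `axis` keeps the operation an exact inverse."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7e10002",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "w = np.random.randn(1, 5)             # the leading axis happens to have length 1\n",
    "w_expanded = np.expand_dims(w, 1)     # shape (1, 1, 5)\n",
    "\n",
    "# squeeze() with no argument drops every length-1 axis\n",
    "squeezed_all = w_expanded.squeeze()\n",
    "# squeeze(axis=1) drops only the axis that expand_dims added\n",
    "squeezed_one = w_expanded.squeeze(axis=1)\n",
    "\n",
    "print(squeezed_all.shape)\n",
    "print(squeezed_one.shape)"
   ]
  },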
  {
   "cell_type": "markdown",
   "id": "1fea1560",
   "metadata": {},
   "source": [
    "Note that `x.reshape(...)` behaves as one would expect. However, when a tensor has many dimensions, reshaping by hand quickly becomes confusing. We therefore introduce the *einops* package (see [here](https://github.com/arogozhnikov/einops)). \n",
    "<br/>einops is a more convenient package for manipulating dimensions.\n",
    "<br/>Some image-processing examples: https://zhuanlan.zhihu.com/p/372692913"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "1cf49ae4",
   "metadata": {},
   "outputs": [],
   "source": [
    "#!pip install einops"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "128e878a",
   "metadata": {},
   "outputs": [],
   "source": [
    "from einops import rearrange, repeat, reduce"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "f5bae364",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(10, 4, 2, 2)\n",
      "[[[[-0.37564502 -1.24405839]\n",
      "   [-2.6787321  -0.03052736]]\n",
      "\n",
      "  [[ 1.09235281  0.25984645]\n",
      "   [-0.8590106   0.62117372]]\n",
      "\n",
      "  [[ 0.30536256  1.19480723]\n",
      "   [ 0.00320785  0.65042704]]\n",
      "\n",
      "  [[-0.86583669  1.52994335]\n",
      "   [ 0.50362747 -0.71632619]]]\n",
      "\n",
      "\n",
      " [[[ 0.5529676  -1.57597508]\n",
      "   [-0.80954614 -1.71115766]]\n",
      "\n",
      "  [[-0.13984377  0.50948613]\n",
      "   [-0.32109307  0.38223505]]\n",
      "\n",
      "  [[-0.14136352  1.89912421]\n",
      "   [-1.04839409 -0.05975481]]\n",
      "\n",
      "  [[-1.20365798  0.44589965]\n",
      "   [-0.92025325 -0.30946765]]]\n",
      "\n",
      "\n",
      " [[[-1.1142685  -0.05659579]\n",
      "   [-0.60698567 -1.62324938]]\n",
      "\n",
      "  [[-0.74631961  1.29185805]\n",
      "   [-0.66991566 -0.18532031]]\n",
      "\n",
      "  [[-0.9544522   2.05773371]\n",
      "   [ 1.44339281  0.62639747]]\n",
      "\n",
      "  [[ 0.27169769 -0.82599423]\n",
      "   [ 0.63710851 -1.01390089]]]\n",
      "\n",
      "\n",
      " [[[ 0.20895844 -0.03759262]\n",
      "   [-0.87324341  0.98350727]]\n",
      "\n",
      "  [[-1.04896633 -0.00493259]\n",
      "   [-1.62797524 -0.43615948]]\n",
      "\n",
      "  [[ 0.84417939 -1.19900295]\n",
      "   [-0.57231259 -0.8150348 ]]\n",
      "\n",
      "  [[-0.29933207 -0.42662218]\n",
      "   [-1.84058001 -0.60484153]]]\n",
      "\n",
      "\n",
      " [[[-1.16293775  1.10182246]\n",
      "   [ 1.12538184  0.69877786]]\n",
      "\n",
      "  [[-0.3226025  -0.48464304]\n",
      "   [ 2.19424981 -0.1990565 ]]\n",
      "\n",
      "  [[ 0.3535688   0.68892536]\n",
      "   [-1.20988876 -1.3289102 ]]\n",
      "\n",
      "  [[ 1.82773686  0.47356345]\n",
      "   [ 0.26112435 -0.37978381]]]\n",
      "\n",
      "\n",
      " [[[-0.39984212 -1.41714462]\n",
      "   [-0.3826965  -1.26517787]]\n",
      "\n",
      "  [[-1.90389924  0.98041821]\n",
      "   [-0.88304092 -1.20069422]]\n",
      "\n",
      "  [[ 1.39493949 -0.29670571]\n",
      "   [-1.22965412 -0.53425737]]\n",
      "\n",
      "  [[-0.21499888  0.60881301]\n",
      "   [-0.15193392  1.5296876 ]]]\n",
      "\n",
      "\n",
      " [[[-0.79310933 -1.07641727]\n",
      "   [-1.19038121  0.15729712]]\n",
      "\n",
      "  [[ 0.67100897  0.11673113]\n",
      "   [ 0.41714645  1.26993833]]\n",
      "\n",
      "  [[-0.74840108  0.06118796]\n",
      "   [-0.16789457 -1.74986493]]\n",
      "\n",
      "  [[ 0.5718798  -0.59672411]\n",
      "   [-0.28022224 -0.54290606]]]\n",
      "\n",
      "\n",
      " [[[ 0.59408574 -0.95796711]\n",
      "   [-1.32546962 -2.24923179]]\n",
      "\n",
      "  [[ 0.25259887  0.25371669]\n",
      "   [ 0.299004    0.86126378]]\n",
      "\n",
      "  [[-0.07636888  1.43255309]\n",
      "   [ 1.35911651  0.83937602]]\n",
      "\n",
      "  [[-1.41107534  0.61158941]\n",
      "   [ 0.10796888  0.37255751]]]\n",
      "\n",
      "\n",
      " [[[-0.9042211   0.0839715 ]\n",
      "   [ 0.72009967  0.08978855]]\n",
      "\n",
      "  [[-1.41790782 -0.10054676]\n",
      "   [ 0.68597249 -0.33429569]]\n",
      "\n",
      "  [[ 0.74685158 -0.02858443]\n",
      "   [-0.62136201  0.17158611]]\n",
      "\n",
      "  [[-0.1454146  -0.31989205]\n",
      "   [ 0.30011368  1.13751235]]]\n",
      "\n",
      "\n",
      " [[[-0.56311495  0.61514464]\n",
      "   [ 0.75776597 -0.53779579]]\n",
      "\n",
      "  [[-0.63715977  0.98182696]\n",
      "   [-0.23729584  0.81548081]]\n",
      "\n",
      "  [[ 0.03509847 -0.23196932]\n",
      "   [-0.52770714 -0.96301411]]\n",
      "\n",
      "  [[ 0.41084324 -0.15062808]\n",
      "   [ 0.70142328  0.33153791]]]]\n"
     ]
    }
   ],
   "source": [
    "x = np.random.randn(10,4, 2, 2)\n",
    "print(x.shape)\n",
    "print(x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "c306e671",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(10, 4, 4)\n",
      "[[[-0.37564502 -1.24405839 -2.6787321  -0.03052736]\n",
      "  [ 1.09235281  0.25984645 -0.8590106   0.62117372]\n",
      "  [ 0.30536256  1.19480723  0.00320785  0.65042704]\n",
      "  [-0.86583669  1.52994335  0.50362747 -0.71632619]]\n",
      "\n",
      " [[ 0.5529676  -1.57597508 -0.80954614 -1.71115766]\n",
      "  [-0.13984377  0.50948613 -0.32109307  0.38223505]\n",
      "  [-0.14136352  1.89912421 -1.04839409 -0.05975481]\n",
      "  [-1.20365798  0.44589965 -0.92025325 -0.30946765]]\n",
      "\n",
      " [[-1.1142685  -0.05659579 -0.60698567 -1.62324938]\n",
      "  [-0.74631961  1.29185805 -0.66991566 -0.18532031]\n",
      "  [-0.9544522   2.05773371  1.44339281  0.62639747]\n",
      "  [ 0.27169769 -0.82599423  0.63710851 -1.01390089]]\n",
      "\n",
      " [[ 0.20895844 -0.03759262 -0.87324341  0.98350727]\n",
      "  [-1.04896633 -0.00493259 -1.62797524 -0.43615948]\n",
      "  [ 0.84417939 -1.19900295 -0.57231259 -0.8150348 ]\n",
      "  [-0.29933207 -0.42662218 -1.84058001 -0.60484153]]\n",
      "\n",
      " [[-1.16293775  1.10182246  1.12538184  0.69877786]\n",
      "  [-0.3226025  -0.48464304  2.19424981 -0.1990565 ]\n",
      "  [ 0.3535688   0.68892536 -1.20988876 -1.3289102 ]\n",
      "  [ 1.82773686  0.47356345  0.26112435 -0.37978381]]\n",
      "\n",
      " [[-0.39984212 -1.41714462 -0.3826965  -1.26517787]\n",
      "  [-1.90389924  0.98041821 -0.88304092 -1.20069422]\n",
      "  [ 1.39493949 -0.29670571 -1.22965412 -0.53425737]\n",
      "  [-0.21499888  0.60881301 -0.15193392  1.5296876 ]]\n",
      "\n",
      " [[-0.79310933 -1.07641727 -1.19038121  0.15729712]\n",
      "  [ 0.67100897  0.11673113  0.41714645  1.26993833]\n",
      "  [-0.74840108  0.06118796 -0.16789457 -1.74986493]\n",
      "  [ 0.5718798  -0.59672411 -0.28022224 -0.54290606]]\n",
      "\n",
      " [[ 0.59408574 -0.95796711 -1.32546962 -2.24923179]\n",
      "  [ 0.25259887  0.25371669  0.299004    0.86126378]\n",
      "  [-0.07636888  1.43255309  1.35911651  0.83937602]\n",
      "  [-1.41107534  0.61158941  0.10796888  0.37255751]]\n",
      "\n",
      " [[-0.9042211   0.0839715   0.72009967  0.08978855]\n",
      "  [-1.41790782 -0.10054676  0.68597249 -0.33429569]\n",
      "  [ 0.74685158 -0.02858443 -0.62136201  0.17158611]\n",
      "  [-0.1454146  -0.31989205  0.30011368  1.13751235]]\n",
      "\n",
      " [[-0.56311495  0.61514464  0.75776597 -0.53779579]\n",
      "  [-0.63715977  0.98182696 -0.23729584  0.81548081]\n",
      "  [ 0.03509847 -0.23196932 -0.52770714 -0.96301411]\n",
      "  [ 0.41084324 -0.15062808  0.70142328  0.33153791]]]\n"
     ]
    }
   ],
   "source": [
    "print(rearrange(x, 'b h j k -> b h (j k)').shape)\n",
    "print(rearrange(x, 'b h j k -> b h (j k)')) # merge the j and k axes into a single axis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "5cd06971",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(10, 2, 4, 2)"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rearrange(x, 'b h j k -> b j h k').shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "7a47eed7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2, 5, 4, 4)"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rearrange(x, '(b1 b2) h j k ->b1 b2 (j k) h', b1=2).shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "8538ed90",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(10, 4, 2)"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(x, 'b h j k -> b h j', 'sum').shape # sum over the k axis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "d9f14561",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(10, 2, 2, 2)"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(x, 'b (h1 h2) j k -> b h2 j k', 'max', h1=2).shape"
   ]
  },
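  {
   "cell_type": "markdown",
   "id": "a7e10003",
   "metadata": {},
   "source": [
    "For comparison, the einops patterns above can also be written with plain NumPy calls. The sketch below uses a fresh array `t` (an illustrative name) with the same shape as before; the mapping given in the comments is our reading of the patterns, so it is worth checking against `rearrange` and `reduce` yourself."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7e10004",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "t = np.random.randn(10, 4, 2, 2)\n",
    "b, h, j, k = t.shape\n",
    "\n",
    "# 'b h j k -> b h (j k)': merge the last two axes\n",
    "merged = t.reshape(b, h, j * k)\n",
    "# 'b h j k -> b j h k': swap the h and j axes\n",
    "swapped = t.transpose(0, 2, 1, 3)\n",
    "# reduce 'b h j k -> b h j' with 'sum': sum over k\n",
    "summed = t.sum(axis=-1)\n",
    "# reduce 'b (h1 h2) j k -> b h2 j k' with 'max', h1=2: split h, then take the max over h1\n",
    "maxed = t.reshape(b, 2, h // 2, j, k).max(axis=1)\n",
    "\n",
    "print(merged.shape, swapped.shape, summed.shape, maxed.shape)"
   ]
  },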
  {
   "cell_type": "markdown",
   "id": "c959d247",
   "metadata": {},
   "source": [
    "## Broadcast and Einsum\n",
    "\n",
    "The following material is among the most difficult, but also the most essential, for building neural networks. Let us start with a simple example. \n",
    "<br/>This section covers broadcasting and Einstein summation."
   ]
  },
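  {
   "cell_type": "markdown",
   "id": "229df98f",
   "metadata": {},
   "source": [
    "### Broadcast\n",
    "\n",
    "Essentially, broadcasting is a mechanism that saves us from explicitly repeating (tiling) an array so that its shape matches another. \n",
    "<br/>In other words, broadcasting is a way to avoid writing repetitive code."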
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "76cd285d",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = np.random.randn(10)\n",
    "y = np.random.randn(4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "2ba3e9b7",
   "metadata": {},
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "operands could not be broadcast together with shapes (10,) (4,) ",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-37-01c2401e2a66>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mx\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0my\u001b[0m \u001b[0;31m# This will not work\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m: operands could not be broadcast together with shapes (10,) (4,) "
     ]
    }
   ],
   "source": [
    "x*y # This will not work"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "50710839",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.00343767, -0.02178648,  0.01507611,  0.02722652],\n",
       "       [ 0.02830096, -0.17935939,  0.12411559,  0.22414499],\n",
       "       [-0.26141348,  1.65672663, -1.14644463, -2.07040723],\n",
       "       [ 0.05944618, -0.37674442,  0.26070482,  0.47081658],\n",
       "       [ 0.02126208, -0.13474994,  0.09324613,  0.16839667],\n",
       "       [-0.1017664 ,  0.64495185, -0.44630271, -0.80599476],\n",
       "       [ 0.14066737, -0.89148949,  0.61690524,  1.11409224],\n",
       "       [ 0.09446827, -0.5986994 ,  0.4142963 ,  0.74819318],\n",
       "       [-0.03465678,  0.21963982, -0.1519894 , -0.27448335],\n",
       "       [ 0.22186362, -1.40607654,  0.97299631,  1.75717043]])"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x[:, None]*y # This works however"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "id": "229b1bbb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(10, 1)\n",
      "(4,)\n"
     ]
    }
   ],
   "source": [
    "print(x[:, None].shape)\n",
    "print(y.shape)"
   ]
  },
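  {
   "cell_type": "markdown",
   "id": "256eed7e",
   "metadata": {},
   "source": [
    "Principle:\n",
    "Two arrays are broadcast-compatible if, for each trailing dimension (counting backwards from the last axis), the axis lengths are equal or one of them is 1.  \n",
    "Broadcasting is then carried out over the missing dimensions and/or the axes of length 1.\n",
    "\n",
    "A scalar can be thought of as having one row and one column;  \n",
    "a one-dimensional array can be thought of as having one row and several columns.\n",
    "\n",
    "In summary, there are three broadcasting rules:\n",
    "\n",
    "1.  If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with 1s on the left.\n",
    "\n",
    "2.  If the shapes differ along an axis but one of the arrays has length 1 there, that axis is stretched to match the other array.\n",
    "\n",
    "3.  If the shapes differ along an axis and neither array has length 1 there, an error is raised.\n",
    "\n",
    "Rules 1 and 2 describe how NumPy tries to match the shapes during an actual computation; rule 3 covers the case where no match is possible.  \n",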
    "https://mp.weixin.qq.com/s?src=11&timestamp=1630417679&ver=3286&signature=QRM6ENv2ByPBulfYL1OlrYAda3RCNOGIqETiGjA65SHG5*65kiM9EJzzIFcIjBElB5S9uHUk8P24WK*xOwBVgEmDL3P1av*GhUVomJ1wZZixRHXGmJcwAhkvVvHC-Sj8&new=1"
   ]
  },
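  {
   "cell_type": "markdown",
   "id": "a7e10005",
   "metadata": {},
   "source": [
    "The three rules can be sketched as a small function. `broadcast_shape` below is a hypothetical helper written for illustration, not part of NumPy; its result is compared with the shape reported by `np.broadcast`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7e10006",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def broadcast_shape(shape_a, shape_b):\n",
    "    # Hypothetical helper: return the broadcast shape of two shapes, or raise ValueError\n",
    "    ndim = max(len(shape_a), len(shape_b))\n",
    "    # Rule 1: pad the shorter shape with 1s on the left\n",
    "    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)\n",
    "    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)\n",
    "    result = []\n",
    "    for m, n in zip(a, b):\n",
    "        if m == n or n == 1:\n",
    "            result.append(m)      # Rule 2: a length-1 axis is stretched to the other length\n",
    "        elif m == 1:\n",
    "            result.append(n)\n",
    "        else:\n",
    "            # Rule 3: the lengths differ and neither is 1\n",
    "            raise ValueError(f'cannot broadcast {shape_a} with {shape_b}')\n",
    "    return tuple(result)\n",
    "\n",
    "print(broadcast_shape((2, 3), (3,)))\n",
    "print(broadcast_shape((3, 1), (3,)))\n",
    "print(np.broadcast(np.ones((3, 1)), np.ones(3)).shape)"
   ]
  },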
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "2aa9da9b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0 1 2 3 4]\n",
      "[ 0  4  8 12 16]\n"
     ]
    }
   ],
   "source": [
    "# Multiplying an array by a scalar\n",
    "# The scalar 4 is broadcast to every element of the array\n",
    "import numpy as np\n",
    "arr = np.arange(5)\n",
    "print(arr) \n",
    "print(arr * 4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "e81603ec",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.82725235  0.07331084  1.39418378]\n"
     ]
    }
   ],
   "source": [
    "# Column means, as needed to demean (center) each column by subtracting its mean\n",
    "arr = np.random.randn(4,3)\n",
    "#print(arr)\n",
    "print(arr.mean(axis=0))"
   ]
  },
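  {
   "cell_type": "markdown",
   "id": "a7e10007",
   "metadata": {},
   "source": [
    "The column means above have shape `(3,)`, so they broadcast against the `(4, 3)` array directly. A minimal sketch of the demeaning step follows; the names `data`, `col_centered` and `row_centered` are illustrative. Centering the rows instead needs `keepdims=True`, so that the row means keep shape `(4, 1)`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7e10008",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "data = np.random.randn(4, 3)\n",
    "\n",
    "# (4, 3) - (3,): the column means are broadcast down the rows\n",
    "col_centered = data - data.mean(axis=0)\n",
    "# (4, 3) - (4, 1): keepdims keeps a length-1 axis so the row means broadcast across the columns\n",
    "row_centered = data - data.mean(axis=1, keepdims=True)\n",
    "\n",
    "print(col_centered.mean(axis=0))   # approximately zero\n",
    "print(row_centered.mean(axis=1))   # approximately zero"
   ]
  },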
  {
   "cell_type": "code",
   "execution_count": 52,
   "id": "a63d4761",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "a: [[1. 1. 1.]\n",
      " [1. 1. 1.]]\n",
      "b: [0 1 2]\n",
      "a+b: [[1. 2. 3.]\n",
      " [1. 2. 3.]]\n"
     ]
    }
   ],
   "source": [
    "# Adding a 1-D array to a 2-D array\n",
    "a = np.ones((2,3))\n",
    "b = np.arange(3)\n",
    "print('a:',a)\n",
    "print('b:',b)\n",
    "print('a+b:',a+b)"
   ]
  },
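  {
   "cell_type": "markdown",
   "id": "74ef8901",
   "metadata": {},
   "source": [
    "Analysis according to the rules:\n",
    "\n",
    "Here a.shape=(2, 3) and b.shape=(3,).\n",
    "\n",
    "By rule 1, b.shape becomes (1, 3).\n",
    "\n",
    "By rule 2, b.shape then becomes (2, 3), which amounts to copying b along the rows.\n",
    "\n",
    "The shapes now match."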
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "id": "e25ed87b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "a: [[0]\n",
      " [1]\n",
      " [2]]\n",
      "b: [0 1 2]\n",
      "a+b: [[0 1 2]\n",
      " [1 2 3]\n",
      " [2 3 4]]\n"
     ]
    }
   ],
   "source": [
    "# Both arrays need to be broadcast\n",
    "a = np.arange(3).reshape((3, 1))\n",
    "print('a:',a)\n",
    "b = np.arange(3)\n",
    "print('b:',b)\n",
    "print('a+b:',a+b)"
   ]
  },
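  {
   "cell_type": "markdown",
   "id": "c4a21a86",
   "metadata": {},
   "source": [
    "Analysis according to the rules:\n",
    "\n",
    "a.shape is (3, 1) and b.shape is (3,).\n",
    "\n",
    "By rule 1, b.shape becomes (1, 3).\n",
    "\n",
    "By rule 2, b.shape then becomes (3, 3), which amounts to copying b along the rows.\n",
    "\n",
    "By rule 2, a.shape also becomes (3, 3), which amounts to copying a along the columns.\n",
    "\n",
    "The shapes now match."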
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "id": "61bb17c1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "a: [[1. 1.]\n",
      " [1. 1.]\n",
      " [1. 1.]]\n",
      "b: [0 1 2]\n"
     ]
    },
    {
     "ename": "ValueError",
     "evalue": "operands could not be broadcast together with shapes (3,2) (3,) ",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-54-c09fe74a600a>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m      4\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'a:'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      5\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'b:'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mb\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'a+b:'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m+\u001b[0m\u001b[0mb\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;31m#报错\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m: operands could not be broadcast together with shapes (3,2) (3,) "
     ]
    }
   ],
   "source": [
    "# An example where the shapes do not match and an error is raised\n",
    "a = np.ones((3,2))\n",
    "b = np.arange(3)\n",
    "print('a:',a)\n",
    "print('b:',b)\n",
    "print('a+b:',a+b) # raises ValueError"
   ]
  },
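  {
   "cell_type": "markdown",
   "id": "c85fa485",
   "metadata": {},
   "source": [
    "Analysis according to the rules:\n",
    "\n",
    "a.shape is (3, 2) and b.shape is (3,).\n",
    "\n",
    "By rule 1, b.shape becomes (1, 3).\n",
    "\n",
    "By rule 2, the leading axis of b is stretched, so b.shape becomes (3, 3), which amounts to copying b along the rows.\n",
    "\n",
    "By rule 3, the trailing axes still differ (2 versus 3) and neither has length 1, so the match fails and an error is raised."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b9a9767d",
   "metadata": {},
   "source": [
    "Examples using matrix multiplication (the `@` operator):"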
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "3d12f8b8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(64, 10, 10)"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.random.randn(64, 10, 5)\n",
    "y = np.random.randn(5, 10)\n",
    "(x @ y).shape # This works. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "id": "a0d08164",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(64, 5, 5)"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(y @ x).shape # This also works. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "id": "69b8b296",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(100, 10)"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.random.randn(100, 10, 5)\n",
    "y = np.random.randn(5)\n",
    "(x @ y).shape # What happens here?"
   ]
  },
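  {
   "cell_type": "markdown",
   "id": "a7e10009",
   "metadata": {},
   "source": [
    "To answer the question above: when the right operand of `@` is 1-D, NumPy treats it as a column matrix by appending an axis of length 1, performs the batched product, and then removes that axis again. The sketch below spells this out with the illustrative names `m` and `v`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7e1000a",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "m = np.random.randn(100, 10, 5)\n",
    "v = np.random.randn(5)\n",
    "\n",
    "direct = m @ v                          # shape (100, 10)\n",
    "# The same computation with the temporary axis written out explicitly\n",
    "explicit = (m @ v[:, None])[..., 0]     # (100, 10, 1) -> (100, 10)\n",
    "\n",
    "print(direct.shape, np.allclose(direct, explicit))"
   ]
  },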
  {
   "cell_type": "markdown",
   "id": "84fc55a1",
   "metadata": {},
   "source": [
    "### Einsum\n",
    "\n",
    "A compact notation based on the summation convention introduced by Einstein. The code looks like `np.einsum('ijk, jkh -> ijh', x, y)`. To read it, follow these steps.\n",
    "\n",
    "1. Work out the dimensionality of the inputs and the output.\n",
    "2. Write out the expression using the indices that are present.\n",
    "3. Sum over the 'missing' indices. \n",
    "\n",
    "For the example above, call the result $z$. Then we have the following.\n",
    "\n",
    "1. $x$ is $I \\times J \\times K$, $y$ is $J \\times K \\times H$ and $z$ is $I \\times J \\times H$. \n",
    "2. To work out the expression, we have $z_{ijh} = ? x_{ij\\cdot} y_{j\\cdot h}.$\n",
    "3. Now since $k$ is missing from the right-hand side of `->` (the output), it must be summed over. In other words $z_{ijh} = \\sum_{k=1}^K x_{ijk} y_{jkh}$. \n",
    "<br/>The Einstein summation convention is a domain-specific language for tensor computations; it is available in NumPy, PyTorch and TensorFlow.  \n",
    "Put simply, it omits the summation signs from a summation expression."
   ]
  },
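  {
   "cell_type": "markdown",
   "id": "70f5c11b",
   "metadata": {},
   "source": [
    "Getting started with einsum  \n",
    "\n",
    "`numpy.einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe', optimize=False)`\n",
    "\n",
    "The `subscripts` string of einsum can be written in two modes:\n",
    "\n",
    "1. Implicit mode: the string contains neither the `->` marker nor output labels, and the output axes are ordered alphabetically by label. For example, `np.einsum('ij', a)` leaves a 2-D array unchanged, whereas `np.einsum('ji', a)` returns its transpose (the i and j axes are swapped).\n",
    "\n",
    "2. Explicit mode: the string contains the `->` marker and the output labels, which makes the function more flexible. For example, `np.einsum('i->', a)` behaves like `np.sum(a, axis=-1)`, and `np.einsum('ii->i', a)` is equivalent to `np.diag(a)`. In explicit mode the order of the output subscripts is specified directly; for example, `np.einsum('ij,jh->ih', a, b)` is a matrix multiplication.\n",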
    "\n",
    "https://numpy.org/devdocs/reference/generated/numpy.einsum.html  \n",
    "https://blog.csdn.net/weixin_42512684/article/details/112598472  \n",
    "https://zhuanlan.zhihu.com/p/44954540"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "79842e32",
   "metadata": {},
   "source": [
    "In Einstein summation notation, $c=\\sum_{i} a_{i} b_{i}$ is written simply as $c=a_{i} b_{i}$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "id": "4afe262a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "60\n",
      "60\n",
      "60\n"
     ]
    }
   ],
   "source": [
    "# Trace of a matrix\n",
    "a = np.arange(25).reshape(5,5)\n",
    "b = np.arange(5)\n",
    "c = np.arange(6).reshape(2,3)\n",
    "print(np.einsum('ii', a))\n",
    "print(np.einsum(a, [0,0]))\n",
    "print(np.trace(a))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "id": "9f8e63c8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  6, 12, 18, 24])"
      ]
     },
     "execution_count": 69,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Extract the diagonal (requires explicit form):\n",
    "np.einsum('ii->i', a)\n",
    "np.einsum(a, [0,0], [0])\n",
    "np.diag(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "id": "482ac164",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 10,  35,  60,  85, 110])"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Sum over an axis (requires explicit form):\n",
    "np.einsum('ij->i', a)\n",
    "np.einsum(a, [0,1], [0])\n",
    "np.sum(a, axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "id": "ef6ba476",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 10,  35,  60,  85, 110])"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# For higher-dimensional arrays, summing over a single axis can be done with an ellipsis.\n",
    "# The ellipsis ('...') stands in for all the remaining axes.\n",
    "np.einsum('...j->...', a)\n",
    "np.einsum(a, [Ellipsis,1], [Ellipsis])\n",
    "np.einsum(a, [...,1], [...])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "id": "d31d90f1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[10, 28, 46, 64],\n",
       "       [13, 40, 67, 94]])"
      ]
     },
     "execution_count": 81,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a1 = np.arange(6).reshape((3,2))\n",
    "b1 = np.arange(12).reshape((4,3))\n",
    "np.einsum('ki,jk->ij', a1, b1)\n",
    "np.einsum('ki,...k->i...', a1, b1)\n",
    "np.einsum('k...,jk', a1, b1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "id": "4c2d1d7a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 3],\n",
       "       [1, 4],\n",
       "       [2, 5]])"
      ]
     },
     "execution_count": 74,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Compute a matrix transpose, or reorder any number of axes:\n",
    "np.einsum('ji', c)\n",
    "np.einsum('ij->ji', c)\n",
    "np.einsum(c, [1,0])\n",
    "np.transpose(c)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "id": "5b730cf0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "30"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Vector inner products:\n",
    "np.einsum('i,i', b, b)\n",
    "np.einsum(b, [0], b, [0])\n",
    "np.inner(b,b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "id": "3868e943",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 30,  80, 130, 180, 230])"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Matrix-vector multiplication\n",
    "np.einsum('ij,j', a, b)\n",
    "np.einsum(a, [0,1], b, [1])\n",
    "np.dot(a, b)\n",
    "np.einsum('...j,j', a, b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "id": "7fbfeb27",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  3,  6],\n",
       "       [ 9, 12, 15]])"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Broadcasting and scalar multiplication\n",
    "np.einsum('..., ...', 3, c)\n",
    "np.einsum(',ij', 3, c)\n",
    "np.einsum(3, [Ellipsis], c, [Ellipsis])\n",
    "np.multiply(3, c)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "id": "9ef5cf9f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1 2]\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "array([[0, 1, 2, 3, 4],\n",
       "       [0, 2, 4, 6, 8]])"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Vector outer product\n",
    "d=np.arange(2)+1\n",
    "print(d)\n",
    "np.einsum('i,j', d, b)\n",
    "np.einsum(d, [0], b, [1])\n",
    "np.outer(d, b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "id": "a696ee92",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[4400., 4730.],\n",
       "       [4532., 4874.],\n",
       "       [4664., 5018.],\n",
       "       [4796., 5162.],\n",
       "       [4928., 5306.]])"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Tensor contraction:\n",
    "a = np.arange(60.).reshape(3,4,5)\n",
    "b = np.arange(24.).reshape(4,3,2)\n",
    "np.einsum('ijk,jil->kl', a, b)\n",
    "np.einsum(a, [0,1,2], b, [1,0,3], [2,3])\n",
    "np.tensordot(a,b, axes=([1,0],[0,1]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "id": "7d70d7d9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[4., 4., 4., 4., 4.],\n",
       "       [4., 4., 4., 4., 4.],\n",
       "       [4., 4., 4., 4., 4.],\n",
       "       [4., 4., 4., 4., 4.],\n",
       "       [4., 4., 4., 4., 4.],\n",
       "       [4., 4., 4., 4., 4.]])"
      ]
     },
     "execution_count": 96,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Matrix multiplication\n",
    "a1=np.ones((6,4))\n",
    "b1=np.ones((4,5))\n",
    "np.einsum('ij...,jk...->ik...',a1,b1)\n",
    "np.einsum('ij,jk->ik',a1,b1)\n",
    "np.matmul(a1,b1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2a18b620",
   "metadata": {},
   "source": [
    "A package that speeds up einsum: https://github.com/dgasmith/opt_einsum"
   ]
  },
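  {
   "cell_type": "markdown",
   "id": "a7e1000b",
   "metadata": {},
   "source": [
    "NumPy itself also offers some contraction-order optimization through the `optimize` argument of `np.einsum` and through `np.einsum_path`. The sketch below uses an illustrative three-matrix chain (the names `A`, `B`, `C` are assumptions for this example); whether it is actually faster depends on the shapes involved, so treat it as something to time on your own data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7e1000c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "A = np.random.randn(8, 50)\n",
    "B = np.random.randn(50, 60)\n",
    "C = np.random.randn(60, 4)\n",
    "\n",
    "# Default evaluation versus letting einsum choose a pairwise contraction order\n",
    "out_default = np.einsum('ij,jk,kl->il', A, B, C)\n",
    "out_optimized = np.einsum('ij,jk,kl->il', A, B, C, optimize=True)\n",
    "\n",
    "# einsum_path reports the chosen order together with a cost estimate\n",
    "path, report = np.einsum_path('ij,jk,kl->il', A, B, C, optimize='greedy')\n",
    "print(path)\n",
    "print(np.allclose(out_default, out_optimized))"
   ]
  },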
  {
   "cell_type": "code",
   "execution_count": 103,
   "id": "e5fa6a3f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(10, 15, 25)\n"
     ]
    }
   ],
   "source": [
    "# A more involved example\n",
    "I, J, K, H = 10, 15, 20, 25\n",
    "x = np.random.randn(I, J, K)\n",
    "y = np.random.randn(J, K, H)\n",
    "z = np.einsum('ijk, jkh -> ijh', x, y)\n",
    "print(z.shape)"
   ]
  },
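  {
   "cell_type": "markdown",
   "id": "a7e1000d",
   "metadata": {},
   "source": [
    "As a sanity check on the three reading steps, the same contraction can be written with explicit loops. This is a sketch on deliberately small, illustrative sizes (the names `xs`, `ys`, `z_loop` and `z_small` are not used elsewhere in this chapter); the loop version is far slower and is shown only to make the summation over $k$ visible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7e1000e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "I_, J_, K_, H_ = 2, 3, 4, 5\n",
    "xs = np.random.randn(I_, J_, K_)\n",
    "ys = np.random.randn(J_, K_, H_)\n",
    "\n",
    "z_loop = np.zeros((I_, J_, H_))\n",
    "for i in range(I_):\n",
    "    for j in range(J_):\n",
    "        for h in range(H_):\n",
    "            # k is absent from the output 'ijh', so it is summed over\n",
    "            z_loop[i, j, h] = sum(xs[i, j, k] * ys[j, k, h] for k in range(K_))\n",
    "\n",
    "z_small = np.einsum('ijk, jkh -> ijh', xs, ys)\n",
    "print(np.allclose(z_loop, z_small))"
   ]
  },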
  {
   "cell_type": "markdown",
   "id": "33a4e03c",
   "metadata": {},
   "source": [
    "### Conversion between Einsum and @\n",
    "\n",
    "---\n",
    "It is surprisingly difficult, although extremely useful, to convert an einsum into a (generalized) matrix multiplication and back. Going from einsum to matrix multiplication is much harder, but also much more beneficial, because einsum is often poorly optimized and can degrade performance. \n",
    "\n",
    "Let us start by converting @ to einsum.   \n",
    "An exact conversion between einsum and matrix multiplication is often hard to find."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "id": "4d60b754",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = np.random.randn(64, 10, 5)\n",
    "y = np.random.randn(5, 10)\n",
    "z_at = x @ y"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8546ca28",
   "metadata": {},
   "source": [
    "Here is how we work through the exercise. \n",
    "\n",
    "1. Work out which operand is broadcast. In this case, $y$ is effectively expanded to $64 \\times 5 \\times 10$.\n",
    "2. Work out the shape of the result. In this case, it should be $64 \\times 10 \\times 10$. \n",
    "3. Work out where the matrix multiplication actually happens. In this case, it acts on the last two (trailing) dimensions. \n",
    "4. Ignore the broadcast for a moment and write out the plain einsum for matrix multiplication: `np.einsum('ij, jk -> ik', x, y)`. \n",
    "5. Start from there and fill in the missing batch index. **Be extremely careful about dimensions that have equal size but are not the same axis.** In this case, there are two $10$s in the expression, but they are not **the same thing**!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "id": "1515375d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.5692347500560786e-14"
      ]
     },
     "execution_count": 99,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "z_ein = np.einsum('bij, jk -> bik', x, y)\n",
    "np.linalg.norm(z_at - z_ein) # Not exactly zero because of floating-point rounding errors."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "45cbff7d",
   "metadata": {},
   "source": [
    "Converting einsum to @ is much harder, and is not in general (easily) possible.\n",
    "\n",
    "There is no fixed routine for the task (sometimes it is not even possible). However, it is very important to write out the sum and look for the parts that resemble a matrix multiplication (if there are any). In general, $X = AB$ means $x_{ik} = \\sum_j a_{ij} b_{jk}$. Therefore, once the latter pattern occurs, just arrange the dimensions to conform to the matrix multiplication. When the einsum contains more than one summation, it is better to stack the tensors along those dimensions so that only one sum remains.\n",
    "\n",
    "We will not delve too deeply into the topic. Let us find the equivalent form of `np.einsum('ijk, jkh -> ijh', x, y)`.\n",
    "\n",
    "Let us write the expression $z_{ijh} = \\sum_{k=1}^K x_{ijk} y_{jkh}$ as $z_{ijh} = x_{ij}^\\top y_{jh}$, where $x_{ij}$ and $y_{jh}$ are vectors of length $K$.\n",
    "To get what we want, we can take $\\tilde{x}$ to be an $IJ \\times K$ matrix and $\\tilde{y}$ to be a $K \\times JH$ matrix, and perform the matrix multiplication.\n",
    "The only problem is that the result is essentially an $I \\times J \\times J \\times H$ tensor, so we somehow gain an additional $J$. To understand what this means, note that its general element is $x_{ij}^\\top y_{j'h}$: the original expression only needs the entries with $j = j'$, so writing it as a single matmul incurs a lot of extra computation.\n",
    "To avoid this overhead, the idea is to move $J$ to the broadcast (batch) dimension; that is, we want a $J \\times I \\times K$ tensor and a $J \\times K \\times H$ tensor. You might have noticed this solution quite early, but the reasoning shows that such conversions are far from trivial."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "id": "60d784d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "I, J, K, H = 10, 15, 20, 25\n",
    "x = np.random.randn(I, J, K)\n",
    "y = np.random.randn(J, K, H)\n",
    "z = np.einsum('ijk, jkh -> ijh', x, y)\n",
    "\n",
    "z_bc = rearrange(rearrange(x, 'i j k -> j i k') @ y, 'j i h -> i j h')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "id": "e78abb92",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.022013082402752e-14"
      ]
     },
     "execution_count": 101,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.linalg.norm(z-z_bc)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "42c9ad95",
   "metadata": {},
   "source": [
    "## Sparse matrices\n",
    "---\n",
    "By default, numpy doesn't support sparse matrix operations. If one wishes to use sparse matrices, one must use `scipy.sparse`.  \n",
    "稀疏矩阵"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "77483149",
   "metadata": {},
   "source": [
    "coo_matrix是最简单的稀疏矩阵存储方式，采用三元组(row, col, data)(或称为ijv format)的形式来存储矩阵中非零元素的信息。在实际使用中，一般coo_matrix用来创建矩阵，因为coo_matrix无法对矩阵的元素进行增删改操作；创建成功之后可以转化成其他格式的稀疏矩阵（如csr_matrix、csc_matrix）进行转置、矩阵乘法等操作"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "id": "684ec638",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[4, 0, 9, 0],\n",
       "       [0, 7, 0, 0],\n",
       "       [0, 0, 0, 0],\n",
       "       [0, 0, 0, 5]])"
      ]
     },
     "execution_count": 104,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import numpy as np\n",
    "from scipy.sparse import coo_matrix\n",
    "\n",
    "row  = np.array([0, 3, 1, 0])\n",
    "col  = np.array([0, 3, 1, 2])\n",
    "data = np.array([4, 5, 7, 9])\n",
    "coo_matrix((data, (row, col)), shape=(4, 4)).toarray()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "id": "5c933b32",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1 2 0]\n",
      " [0 0 3]\n",
      " [1 0 4]]\n"
     ]
    }
   ],
   "source": [
    "A = np.array([[1,2,0],[0,0,3],[1,0,4]])\n",
    "coo_matrix(A)\n",
    "print(coo_matrix(A).toarray())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "id": "9864228f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  (0, 2)\t6\n",
      "  (0, 1)\t2\n",
      "  (0, 0)\t1\n",
      "  (1, 2)\t12\n",
      "  (1, 0)\t3\n",
      "  (2, 2)\t16\n",
      "  (2, 1)\t2\n",
      "  (2, 0)\t5\n"
     ]
    }
   ],
   "source": [
    "print(coo_matrix(A).dot(coo_matrix(A)))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b5d2b230",
   "metadata": {},
   "source": [
    "### Pandas"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 108,
   "id": "ee7aca1e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Cloning into 'cardashians'...\n",
      "remote: Enumerating objects: 64, done.\u001b[K\n",
      "remote: Total 64 (delta 0), reused 0 (delta 0), pack-reused 64\u001b[K\n",
      "Receiving objects: 100% (64/64), 15.99 MiB | 1.91 MiB/s, done.\n",
      "Resolving deltas: 100% (31/31), done.\n"
     ]
    }
   ],
   "source": [
    "#下载一个数据集便于演示：github不稳定，有时需要运行多次\n",
    "!git clone https://github.com/tolarteh/cardashians.git"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 141,
   "id": "d3c38e35",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "#数据读取，注意engine参数,修index设为默认从0开始避免错位\n",
    "x_train = pd.read_csv('./cardashians/xtrain.csv', engine='python',index_col=False )\n",
    "x_test = pd.read_csv('./cardashians/xtest.csv', engine='python',index_col=False )\n",
    "\n",
    "y_train = pd.read_csv('./cardashians/ytrain.csv', engine='python',index_col=False )\n",
    "y_test = pd.read_csv('./cardashians/ytest.csv', engine='python',index_col=False )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 142,
   "id": "979edfb6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "72983"
      ]
     },
     "execution_count": 142,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#获取样本总量\n",
    "TRAIN_IDX=x_train.shape[0]\n",
    "TEST_IDX = TRAIN_IDX + x_test.shape[0]\n",
    "TEST_IDX"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 148,
   "id": "a110de21",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>VehicleAge</th>\n",
       "      <th>VehOdo</th>\n",
       "      <th>MMRAcquisitionAuctionAveragePrice</th>\n",
       "      <th>MMRAcquisitionAuctionCleanPrice</th>\n",
       "      <th>MMRAcquisitionRetailAveragePrice</th>\n",
       "      <th>MMRAcquisitonRetailCleanPrice</th>\n",
       "      <th>MMRCurrentAuctionAveragePrice</th>\n",
       "      <th>MMRCurrentAuctionCleanPrice</th>\n",
       "      <th>MMRCurrentRetailAveragePrice</th>\n",
       "      <th>MMRCurrentRetailCleanPrice</th>\n",
       "      <th>...</th>\n",
       "      <th>VNST_PA</th>\n",
       "      <th>VNST_SC</th>\n",
       "      <th>VNST_TN</th>\n",
       "      <th>VNST_TX</th>\n",
       "      <th>VNST_UT</th>\n",
       "      <th>VNST_VA</th>\n",
       "      <th>VNST_WA</th>\n",
       "      <th>VNST_WI</th>\n",
       "      <th>VNST_WV</th>\n",
       "      <th>IsBadBuy</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2</td>\n",
       "      <td>75645</td>\n",
       "      <td>8337.0</td>\n",
       "      <td>9645.0</td>\n",
       "      <td>12659.0</td>\n",
       "      <td>13621.0</td>\n",
       "      <td>8201.0</td>\n",
       "      <td>9792.0</td>\n",
       "      <td>12002.0</td>\n",
       "      <td>13607.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>4</td>\n",
       "      <td>84095</td>\n",
       "      <td>8829.0</td>\n",
       "      <td>10387.0</td>\n",
       "      <td>10035.0</td>\n",
       "      <td>11718.0</td>\n",
       "      <td>8987.0</td>\n",
       "      <td>10565.0</td>\n",
       "      <td>10206.0</td>\n",
       "      <td>11910.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>51780</td>\n",
       "      <td>6003.0</td>\n",
       "      <td>6904.0</td>\n",
       "      <td>6983.0</td>\n",
       "      <td>7956.0</td>\n",
       "      <td>6517.0</td>\n",
       "      <td>7437.0</td>\n",
       "      <td>7538.0</td>\n",
       "      <td>8532.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>6</td>\n",
       "      <td>73746</td>\n",
       "      <td>5438.0</td>\n",
       "      <td>6804.0</td>\n",
       "      <td>8608.0</td>\n",
       "      <td>9687.0</td>\n",
       "      <td>4086.0</td>\n",
       "      <td>5559.0</td>\n",
       "      <td>7708.0</td>\n",
       "      <td>8821.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>7</td>\n",
       "      <td>81897</td>\n",
       "      <td>1911.0</td>\n",
       "      <td>2753.0</td>\n",
       "      <td>2564.0</td>\n",
       "      <td>3473.0</td>\n",
       "      <td>1367.0</td>\n",
       "      <td>2128.0</td>\n",
       "      <td>1976.0</td>\n",
       "      <td>2798.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 145 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   VehicleAge  VehOdo  MMRAcquisitionAuctionAveragePrice  \\\n",
       "0           2   75645                             8337.0   \n",
       "1           4   84095                             8829.0   \n",
       "2           3   51780                             6003.0   \n",
       "3           6   73746                             5438.0   \n",
       "4           7   81897                             1911.0   \n",
       "\n",
       "   MMRAcquisitionAuctionCleanPrice  MMRAcquisitionRetailAveragePrice  \\\n",
       "0                           9645.0                           12659.0   \n",
       "1                          10387.0                           10035.0   \n",
       "2                           6904.0                            6983.0   \n",
       "3                           6804.0                            8608.0   \n",
       "4                           2753.0                            2564.0   \n",
       "\n",
       "   MMRAcquisitonRetailCleanPrice  MMRCurrentAuctionAveragePrice  \\\n",
       "0                        13621.0                         8201.0   \n",
       "1                        11718.0                         8987.0   \n",
       "2                         7956.0                         6517.0   \n",
       "3                         9687.0                         4086.0   \n",
       "4                         3473.0                         1367.0   \n",
       "\n",
       "   MMRCurrentAuctionCleanPrice  MMRCurrentRetailAveragePrice  \\\n",
       "0                       9792.0                       12002.0   \n",
       "1                      10565.0                       10206.0   \n",
       "2                       7437.0                        7538.0   \n",
       "3                       5559.0                        7708.0   \n",
       "4                       2128.0                        1976.0   \n",
       "\n",
       "   MMRCurrentRetailCleanPrice  ...  VNST_PA  VNST_SC  VNST_TN  VNST_TX  \\\n",
       "0                     13607.0  ...        0        0        0        0   \n",
       "1                     11910.0  ...        0        0        0        1   \n",
       "2                      8532.0  ...        0        0        0        0   \n",
       "3                      8821.0  ...        0        0        0        0   \n",
       "4                      2798.0  ...        0        1        0        0   \n",
       "\n",
       "   VNST_UT  VNST_VA  VNST_WA  VNST_WI  VNST_WV  IsBadBuy  \n",
       "0        0        0        0        0        0         0  \n",
       "1        0        0        0        0        0         0  \n",
       "2        0        1        0        0        0         0  \n",
       "3        0        0        0        0        0         0  \n",
       "4        0        0        0        0        0         0  \n",
       "\n",
       "[5 rows x 145 columns]"
      ]
     },
     "execution_count": 148,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#拼接\n",
    "x = pd.concat([x_train, x_test], axis=0)\n",
    "y = pd.concat([y_train, y_test], axis=0)\n",
    "\n",
    "data = pd.concat([x, y], axis=1)\n",
    "#重置索引避免重复\n",
    "data = data.reset_index(drop = True)\n",
    "data.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 150,
   "id": "a302a9d8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['VehicleAge',\n",
       " 'VehOdo',\n",
       " 'MMRAcquisitionAuctionAveragePrice',\n",
       " 'MMRAcquisitionAuctionCleanPrice',\n",
       " 'MMRAcquisitionRetailAveragePrice',\n",
       " 'MMRAcquisitonRetailCleanPrice',\n",
       " 'MMRCurrentAuctionAveragePrice',\n",
       " 'MMRCurrentAuctionCleanPrice',\n",
       " 'MMRCurrentRetailAveragePrice',\n",
       " 'MMRCurrentRetailCleanPrice',\n",
       " 'BYRNO',\n",
       " 'VehBCost',\n",
       " 'IsOnlineSale',\n",
       " 'WarrantyCost',\n",
       " 'Auction_ADESA',\n",
       " 'Auction_MANHEIM',\n",
       " 'Auction_OTHER',\n",
       " 'Make_ACURA',\n",
       " 'Make_BUICK',\n",
       " 'Make_CADILLAC',\n",
       " 'Make_CHEVROLET',\n",
       " 'Make_CHRYSLER',\n",
       " 'Make_DODGE',\n",
       " 'Make_FORD',\n",
       " 'Make_GMC',\n",
       " 'Make_HONDA',\n",
       " 'Make_HUMMER',\n",
       " 'Make_HYUNDAI',\n",
       " 'Make_INFINITI',\n",
       " 'Make_ISUZU',\n",
       " 'Make_JEEP',\n",
       " 'Make_KIA',\n",
       " 'Make_LEXUS',\n",
       " 'Make_LINCOLN',\n",
       " 'Make_MAZDA',\n",
       " 'Make_MERCURY',\n",
       " 'Make_MINI',\n",
       " 'Make_MITSUBISHI',\n",
       " 'Make_NISSAN',\n",
       " 'Make_OLDSMOBILE',\n",
       " 'Make_PLYMOUTH',\n",
       " 'Make_PONTIAC',\n",
       " 'Make_SATURN',\n",
       " 'Make_SCION',\n",
       " 'Make_SUBARU',\n",
       " 'Make_SUZUKI',\n",
       " 'Make_TOYOTA',\n",
       " 'Make_TOYOTA SCION',\n",
       " 'Make_VOLKSWAGEN',\n",
       " 'Make_VOLVO',\n",
       " 'Color_BEIGE',\n",
       " 'Color_BLACK',\n",
       " 'Color_BLUE',\n",
       " 'Color_BROWN',\n",
       " 'Color_GOLD',\n",
       " 'Color_GREEN',\n",
       " 'Color_GREY',\n",
       " 'Color_MAROON',\n",
       " 'Color_NOT AVAIL',\n",
       " 'Color_ORANGE',\n",
       " 'Color_OTHER',\n",
       " 'Color_PINK',\n",
       " 'Color_PURPLE',\n",
       " 'Color_RED',\n",
       " 'Color_SILVER',\n",
       " 'Color_U0',\n",
       " 'Color_WHITE',\n",
       " 'Color_YELLOW',\n",
       " 'Transmission_AUTO',\n",
       " 'Transmission_MANUAL',\n",
       " 'Transmission_Manual',\n",
       " 'Transmission_U0',\n",
       " 'WheelTypeID_0.0',\n",
       " 'WheelTypeID_1.0',\n",
       " 'WheelTypeID_2.0',\n",
       " 'WheelTypeID_3.0',\n",
       " 'WheelTypeID_U0',\n",
       " 'Nationality_AMERICAN',\n",
       " 'Nationality_OTHER',\n",
       " 'Nationality_OTHER ASIAN',\n",
       " 'Nationality_TOP LINE ASIAN',\n",
       " 'Nationality_U0',\n",
       " 'Size_COMPACT',\n",
       " 'Size_CROSSOVER',\n",
       " 'Size_LARGE',\n",
       " 'Size_LARGE SUV',\n",
       " 'Size_LARGE TRUCK',\n",
       " 'Size_MEDIUM',\n",
       " 'Size_MEDIUM SUV',\n",
       " 'Size_SMALL SUV',\n",
       " 'Size_SMALL TRUCK',\n",
       " 'Size_SPECIALTY',\n",
       " 'Size_SPORTS',\n",
       " 'Size_U0',\n",
       " 'Size_VAN',\n",
       " 'TopThreeAmericanName_CHRYSLER',\n",
       " 'TopThreeAmericanName_FORD',\n",
       " 'TopThreeAmericanName_GM',\n",
       " 'TopThreeAmericanName_OTHER',\n",
       " 'TopThreeAmericanName_U0',\n",
       " 'PRIMEUNIT_NO',\n",
       " 'PRIMEUNIT_U0',\n",
       " 'PRIMEUNIT_YES',\n",
       " 'AUCGUART_GREEN',\n",
       " 'AUCGUART_RED',\n",
       " 'AUCGUART_U0',\n",
       " 'VNST_AL',\n",
       " 'VNST_AR',\n",
       " 'VNST_AZ',\n",
       " 'VNST_CA',\n",
       " 'VNST_CO',\n",
       " 'VNST_FL',\n",
       " 'VNST_GA',\n",
       " 'VNST_IA',\n",
       " 'VNST_ID',\n",
       " 'VNST_IL',\n",
       " 'VNST_IN',\n",
       " 'VNST_KY',\n",
       " 'VNST_LA',\n",
       " 'VNST_MA',\n",
       " 'VNST_MD',\n",
       " 'VNST_MI',\n",
       " 'VNST_MN',\n",
       " 'VNST_MO',\n",
       " 'VNST_MS',\n",
       " 'VNST_NC',\n",
       " 'VNST_NE',\n",
       " 'VNST_NH',\n",
       " 'VNST_NJ',\n",
       " 'VNST_NM',\n",
       " 'VNST_NV',\n",
       " 'VNST_NY',\n",
       " 'VNST_OH',\n",
       " 'VNST_OK',\n",
       " 'VNST_OR',\n",
       " 'VNST_PA',\n",
       " 'VNST_SC',\n",
       " 'VNST_TN',\n",
       " 'VNST_TX',\n",
       " 'VNST_UT',\n",
       " 'VNST_VA',\n",
       " 'VNST_WA',\n",
       " 'VNST_WI',\n",
       " 'VNST_WV',\n",
       " 'IsBadBuy']"
      ]
     },
     "execution_count": 150,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#列名\n",
    "data.columns.to_list()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 151,
   "id": "31200e21",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0 1]\n",
      "0    64007\n",
      "1     8976\n",
      "Name: IsBadBuy, dtype: int64\n"
     ]
    }
   ],
   "source": [
    "#获取不重复值\n",
    "print(data['IsBadBuy'].unique())\n",
    "#统计频次\n",
    "print(data['IsBadBuy'].value_counts())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 152,
   "id": "0a431d6b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "45503"
      ]
     },
     "execution_count": 152,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#数据筛选\n",
    "#直接索引\n",
    "data['VehicleAge'][data['VehicleAge']>3].count()\n",
    "#loc：通过标签选取数据，即通过index和columns的值进行选取\n",
    "data.loc[data['VehicleAge']>3, 'VehicleAge'].count()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 153,
   "id": "8025d01e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>VehOdo</th>\n",
       "      <th>VehicleAge</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>84095</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>73746</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>81897</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>76429</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>73260</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>72975</th>\n",
       "      <td>72266</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>72976</th>\n",
       "      <td>76158</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>72977</th>\n",
       "      <td>84025</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>72981</th>\n",
       "      <td>54282</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>72982</th>\n",
       "      <td>79306</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>45503 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       VehOdo  VehicleAge\n",
       "1       84095           4\n",
       "3       73746           6\n",
       "4       81897           7\n",
       "8       76429           7\n",
       "10      73260           6\n",
       "...       ...         ...\n",
       "72975   72266           6\n",
       "72976   76158           5\n",
       "72977   84025           5\n",
       "72981   54282           5\n",
       "72982   79306           6\n",
       "\n",
       "[45503 rows x 2 columns]"
      ]
     },
     "execution_count": 153,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.loc[(data['VehOdo'] > 5000)&(data['VehicleAge']>3), ['VehOdo', 'VehicleAge']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 154,
   "id": "67a874b8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>VehicleAge</th>\n",
       "      <th>VehOdo</th>\n",
       "      <th>MMRAcquisitionAuctionAveragePrice</th>\n",
       "      <th>MMRAcquisitionAuctionCleanPrice</th>\n",
       "      <th>MMRAcquisitionRetailAveragePrice</th>\n",
       "      <th>MMRAcquisitonRetailCleanPrice</th>\n",
       "      <th>MMRCurrentAuctionAveragePrice</th>\n",
       "      <th>MMRCurrentAuctionCleanPrice</th>\n",
       "      <th>MMRCurrentRetailAveragePrice</th>\n",
       "      <th>MMRCurrentRetailCleanPrice</th>\n",
       "      <th>...</th>\n",
       "      <th>VNST_PA</th>\n",
       "      <th>VNST_SC</th>\n",
       "      <th>VNST_TN</th>\n",
       "      <th>VNST_TX</th>\n",
       "      <th>VNST_UT</th>\n",
       "      <th>VNST_VA</th>\n",
       "      <th>VNST_WA</th>\n",
       "      <th>VNST_WI</th>\n",
       "      <th>VNST_WV</th>\n",
       "      <th>IsBadBuy</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>147</th>\n",
       "      <td>2</td>\n",
       "      <td>12628</td>\n",
       "      <td>5657.0</td>\n",
       "      <td>6350.0</td>\n",
       "      <td>6610.0</td>\n",
       "      <td>7358.0</td>\n",
       "      <td>5617.0</td>\n",
       "      <td>6487.0</td>\n",
       "      <td>6566.0</td>\n",
       "      <td>7506.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2876</th>\n",
       "      <td>1</td>\n",
       "      <td>13445</td>\n",
       "      <td>19546.0</td>\n",
       "      <td>20809.0</td>\n",
       "      <td>23361.0</td>\n",
       "      <td>24870.0</td>\n",
       "      <td>20817.0</td>\n",
       "      <td>21601.0</td>\n",
       "      <td>24286.0</td>\n",
       "      <td>25060.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6111</th>\n",
       "      <td>1</td>\n",
       "      <td>19610</td>\n",
       "      <td>14539.0</td>\n",
       "      <td>15841.0</td>\n",
       "      <td>20008.0</td>\n",
       "      <td>21662.0</td>\n",
       "      <td>17343.0</td>\n",
       "      <td>18942.0</td>\n",
       "      <td>20291.0</td>\n",
       "      <td>22079.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7564</th>\n",
       "      <td>4</td>\n",
       "      <td>14547</td>\n",
       "      <td>3641.0</td>\n",
       "      <td>4480.0</td>\n",
       "      <td>4432.0</td>\n",
       "      <td>5338.0</td>\n",
       "      <td>4334.0</td>\n",
       "      <td>5392.0</td>\n",
       "      <td>5181.0</td>\n",
       "      <td>6323.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7977</th>\n",
       "      <td>1</td>\n",
       "      <td>10643</td>\n",
       "      <td>6217.0</td>\n",
       "      <td>7325.0</td>\n",
       "      <td>7214.0</td>\n",
       "      <td>8411.0</td>\n",
       "      <td>6740.0</td>\n",
       "      <td>7937.0</td>\n",
       "      <td>7779.0</td>\n",
       "      <td>9072.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11084</th>\n",
       "      <td>5</td>\n",
       "      <td>17538</td>\n",
       "      <td>2824.0</td>\n",
       "      <td>4200.0</td>\n",
       "      <td>3550.0</td>\n",
       "      <td>5036.0</td>\n",
       "      <td>3015.0</td>\n",
       "      <td>3879.0</td>\n",
       "      <td>6214.0</td>\n",
       "      <td>7007.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12094</th>\n",
       "      <td>6</td>\n",
       "      <td>15655</td>\n",
       "      <td>4281.0</td>\n",
       "      <td>5752.0</td>\n",
       "      <td>8606.0</td>\n",
       "      <td>9835.0</td>\n",
       "      <td>3953.0</td>\n",
       "      <td>5318.0</td>\n",
       "      <td>7761.0</td>\n",
       "      <td>9057.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13092</th>\n",
       "      <td>2</td>\n",
       "      <td>10095</td>\n",
       "      <td>32250.0</td>\n",
       "      <td>35215.0</td>\n",
       "      <td>35330.0</td>\n",
       "      <td>38532.0</td>\n",
       "      <td>32250.0</td>\n",
       "      <td>35215.0</td>\n",
       "      <td>35330.0</td>\n",
       "      <td>38532.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14246</th>\n",
       "      <td>8</td>\n",
       "      <td>19070</td>\n",
       "      <td>3919.0</td>\n",
       "      <td>5867.0</td>\n",
       "      <td>6134.0</td>\n",
       "      <td>7433.0</td>\n",
       "      <td>3828.0</td>\n",
       "      <td>5339.0</td>\n",
       "      <td>6614.0</td>\n",
       "      <td>7872.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17491</th>\n",
       "      <td>6</td>\n",
       "      <td>5368</td>\n",
       "      <td>2617.0</td>\n",
       "      <td>3512.0</td>\n",
       "      <td>3326.0</td>\n",
       "      <td>4293.0</td>\n",
       "      <td>2690.0</td>\n",
       "      <td>3375.0</td>\n",
       "      <td>3405.0</td>\n",
       "      <td>4145.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19000</th>\n",
       "      <td>5</td>\n",
       "      <td>19430</td>\n",
       "      <td>4481.0</td>\n",
       "      <td>6235.0</td>\n",
       "      <td>5339.0</td>\n",
       "      <td>7234.0</td>\n",
       "      <td>4481.0</td>\n",
       "      <td>6235.0</td>\n",
       "      <td>5339.0</td>\n",
       "      <td>7234.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20035</th>\n",
       "      <td>3</td>\n",
       "      <td>15894</td>\n",
       "      <td>4066.0</td>\n",
       "      <td>4852.0</td>\n",
       "      <td>4891.0</td>\n",
       "      <td>5740.0</td>\n",
       "      <td>4066.0</td>\n",
       "      <td>4852.0</td>\n",
       "      <td>4891.0</td>\n",
       "      <td>5740.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27551</th>\n",
       "      <td>3</td>\n",
       "      <td>15249</td>\n",
       "      <td>5427.0</td>\n",
       "      <td>6508.0</td>\n",
       "      <td>6361.0</td>\n",
       "      <td>7529.0</td>\n",
       "      <td>5540.0</td>\n",
       "      <td>6569.0</td>\n",
       "      <td>6483.0</td>\n",
       "      <td>7595.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28299</th>\n",
       "      <td>5</td>\n",
       "      <td>14187</td>\n",
       "      <td>3498.0</td>\n",
       "      <td>4570.0</td>\n",
       "      <td>4278.0</td>\n",
       "      <td>5436.0</td>\n",
       "      <td>2856.0</td>\n",
       "      <td>3976.0</td>\n",
       "      <td>3584.0</td>\n",
       "      <td>4794.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36221</th>\n",
       "      <td>7</td>\n",
       "      <td>18440</td>\n",
       "      <td>2622.0</td>\n",
       "      <td>3740.0</td>\n",
       "      <td>3332.0</td>\n",
       "      <td>4539.0</td>\n",
       "      <td>2622.0</td>\n",
       "      <td>3740.0</td>\n",
       "      <td>3332.0</td>\n",
       "      <td>4539.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36699</th>\n",
       "      <td>3</td>\n",
       "      <td>16545</td>\n",
       "      <td>4801.0</td>\n",
       "      <td>5679.0</td>\n",
       "      <td>5685.0</td>\n",
       "      <td>6633.0</td>\n",
       "      <td>4396.0</td>\n",
       "      <td>5208.0</td>\n",
       "      <td>5248.0</td>\n",
       "      <td>6125.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>40461</th>\n",
       "      <td>2</td>\n",
       "      <td>18813</td>\n",
       "      <td>5824.0</td>\n",
       "      <td>7192.0</td>\n",
       "      <td>6790.0</td>\n",
       "      <td>8267.0</td>\n",
       "      <td>5824.0</td>\n",
       "      <td>7192.0</td>\n",
       "      <td>6790.0</td>\n",
       "      <td>8267.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>42083</th>\n",
       "      <td>1</td>\n",
       "      <td>9446</td>\n",
       "      <td>27680.0</td>\n",
       "      <td>30114.0</td>\n",
       "      <td>31599.0</td>\n",
       "      <td>33736.0</td>\n",
       "      <td>27795.0</td>\n",
       "      <td>30136.0</td>\n",
       "      <td>31128.0</td>\n",
       "      <td>33014.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>42734</th>\n",
       "      <td>4</td>\n",
       "      <td>9878</td>\n",
       "      <td>3880.0</td>\n",
       "      <td>4795.0</td>\n",
       "      <td>7030.0</td>\n",
       "      <td>8244.0</td>\n",
       "      <td>3136.0</td>\n",
       "      <td>4160.0</td>\n",
       "      <td>6380.0</td>\n",
       "      <td>7925.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>45365</th>\n",
       "      <td>2</td>\n",
       "      <td>19751</td>\n",
       "      <td>10956.0</td>\n",
       "      <td>12651.0</td>\n",
       "      <td>12332.0</td>\n",
       "      <td>14163.0</td>\n",
       "      <td>11155.0</td>\n",
       "      <td>12897.0</td>\n",
       "      <td>12547.0</td>\n",
       "      <td>14429.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>45583</th>\n",
       "      <td>2</td>\n",
       "      <td>18923</td>\n",
       "      <td>5582.0</td>\n",
       "      <td>6339.0</td>\n",
       "      <td>6529.0</td>\n",
       "      <td>7346.0</td>\n",
       "      <td>5667.0</td>\n",
       "      <td>6367.0</td>\n",
       "      <td>6620.0</td>\n",
       "      <td>7376.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>46703</th>\n",
       "      <td>5</td>\n",
       "      <td>14474</td>\n",
       "      <td>4800.0</td>\n",
       "      <td>5892.0</td>\n",
       "      <td>5684.0</td>\n",
       "      <td>6863.0</td>\n",
       "      <td>4876.0</td>\n",
       "      <td>5856.0</td>\n",
       "      <td>5766.0</td>\n",
       "      <td>6824.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>46803</th>\n",
       "      <td>1</td>\n",
       "      <td>17502</td>\n",
       "      <td>35722.0</td>\n",
       "      <td>36859.0</td>\n",
       "      <td>39080.0</td>\n",
       "      <td>40308.0</td>\n",
       "      <td>35722.0</td>\n",
       "      <td>36859.0</td>\n",
       "      <td>39080.0</td>\n",
       "      <td>40308.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48345</th>\n",
       "      <td>5</td>\n",
       "      <td>19200</td>\n",
       "      <td>4041.0</td>\n",
       "      <td>4991.0</td>\n",
       "      <td>7360.0</td>\n",
       "      <td>8241.0</td>\n",
       "      <td>3956.0</td>\n",
       "      <td>5032.0</td>\n",
       "      <td>7592.0</td>\n",
       "      <td>8498.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>49287</th>\n",
       "      <td>5</td>\n",
       "      <td>18777</td>\n",
       "      <td>2463.0</td>\n",
       "      <td>3276.0</td>\n",
       "      <td>3160.0</td>\n",
       "      <td>4038.0</td>\n",
       "      <td>2994.0</td>\n",
       "      <td>3629.0</td>\n",
       "      <td>3734.0</td>\n",
       "      <td>4419.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>49610</th>\n",
       "      <td>2</td>\n",
       "      <td>12926</td>\n",
       "      <td>6324.0</td>\n",
       "      <td>7586.0</td>\n",
       "      <td>7330.0</td>\n",
       "      <td>8693.0</td>\n",
       "      <td>6914.0</td>\n",
       "      <td>7776.0</td>\n",
       "      <td>7967.0</td>\n",
       "      <td>8898.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50021</th>\n",
       "      <td>2</td>\n",
       "      <td>11662</td>\n",
       "      <td>19250.0</td>\n",
       "      <td>23021.0</td>\n",
       "      <td>21290.0</td>\n",
       "      <td>25363.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50367</th>\n",
       "      <td>3</td>\n",
       "      <td>15409</td>\n",
       "      <td>16536.0</td>\n",
       "      <td>18384.0</td>\n",
       "      <td>20976.0</td>\n",
       "      <td>23011.0</td>\n",
       "      <td>16988.0</td>\n",
       "      <td>18762.0</td>\n",
       "      <td>21431.0</td>\n",
       "      <td>23285.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>51978</th>\n",
       "      <td>2</td>\n",
       "      <td>15743</td>\n",
       "      <td>5828.0</td>\n",
       "      <td>6720.0</td>\n",
       "      <td>9332.0</td>\n",
       "      <td>10761.0</td>\n",
       "      <td>5700.0</td>\n",
       "      <td>6702.0</td>\n",
       "      <td>9249.0</td>\n",
       "      <td>10140.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>53834</th>\n",
       "      <td>2</td>\n",
       "      <td>13924</td>\n",
       "      <td>32063.0</td>\n",
       "      <td>35108.0</td>\n",
       "      <td>37885.0</td>\n",
       "      <td>41482.0</td>\n",
       "      <td>31127.0</td>\n",
       "      <td>34798.0</td>\n",
       "      <td>38151.0</td>\n",
       "      <td>41062.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>53979</th>\n",
       "      <td>5</td>\n",
       "      <td>19825</td>\n",
       "      <td>2941.0</td>\n",
       "      <td>4054.0</td>\n",
       "      <td>6188.0</td>\n",
       "      <td>8529.0</td>\n",
       "      <td>3439.0</td>\n",
       "      <td>4353.0</td>\n",
       "      <td>6713.0</td>\n",
       "      <td>7833.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>55574</th>\n",
       "      <td>5</td>\n",
       "      <td>16574</td>\n",
       "      <td>4481.0</td>\n",
       "      <td>6235.0</td>\n",
       "      <td>5339.0</td>\n",
       "      <td>7234.0</td>\n",
       "      <td>4481.0</td>\n",
       "      <td>6235.0</td>\n",
       "      <td>5339.0</td>\n",
       "      <td>7234.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>55789</th>\n",
       "      <td>7</td>\n",
       "      <td>19983</td>\n",
       "      <td>2622.0</td>\n",
       "      <td>3740.0</td>\n",
       "      <td>3332.0</td>\n",
       "      <td>4539.0</td>\n",
       "      <td>2622.0</td>\n",
       "      <td>3740.0</td>\n",
       "      <td>3332.0</td>\n",
       "      <td>4539.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>57292</th>\n",
       "      <td>6</td>\n",
       "      <td>19534</td>\n",
       "      <td>2560.0</td>\n",
       "      <td>3765.0</td>\n",
       "      <td>3265.0</td>\n",
       "      <td>4566.0</td>\n",
       "      <td>2560.0</td>\n",
       "      <td>3765.0</td>\n",
       "      <td>3265.0</td>\n",
       "      <td>4566.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>58356</th>\n",
       "      <td>1</td>\n",
       "      <td>8706</td>\n",
       "      <td>8440.0</td>\n",
       "      <td>8841.0</td>\n",
       "      <td>11948.0</td>\n",
       "      <td>12346.0</td>\n",
       "      <td>8601.0</td>\n",
       "      <td>9104.0</td>\n",
       "      <td>11739.0</td>\n",
       "      <td>12276.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>62083</th>\n",
       "      <td>1</td>\n",
       "      <td>19137</td>\n",
       "      <td>9335.0</td>\n",
       "      <td>10203.0</td>\n",
       "      <td>10582.0</td>\n",
       "      <td>11519.0</td>\n",
       "      <td>9335.0</td>\n",
       "      <td>10203.0</td>\n",
       "      <td>10582.0</td>\n",
       "      <td>11519.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>62436</th>\n",
       "      <td>7</td>\n",
       "      <td>14961</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>62557</th>\n",
       "      <td>5</td>\n",
       "      <td>16713</td>\n",
       "      <td>4481.0</td>\n",
       "      <td>6235.0</td>\n",
       "      <td>5339.0</td>\n",
       "      <td>7234.0</td>\n",
       "      <td>4481.0</td>\n",
       "      <td>6235.0</td>\n",
       "      <td>5339.0</td>\n",
       "      <td>7234.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>64223</th>\n",
       "      <td>6</td>\n",
       "      <td>17281</td>\n",
       "      <td>4281.0</td>\n",
       "      <td>5752.0</td>\n",
       "      <td>8606.0</td>\n",
       "      <td>9835.0</td>\n",
       "      <td>4736.0</td>\n",
       "      <td>5987.0</td>\n",
       "      <td>8223.0</td>\n",
       "      <td>9902.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>68169</th>\n",
       "      <td>6</td>\n",
       "      <td>14519</td>\n",
       "      <td>1683.0</td>\n",
       "      <td>2473.0</td>\n",
       "      <td>2318.0</td>\n",
       "      <td>3171.0</td>\n",
       "      <td>1683.0</td>\n",
       "      <td>2473.0</td>\n",
       "      <td>2318.0</td>\n",
       "      <td>3171.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>71098</th>\n",
       "      <td>5</td>\n",
       "      <td>19813</td>\n",
       "      <td>4481.0</td>\n",
       "      <td>6235.0</td>\n",
       "      <td>5339.0</td>\n",
       "      <td>7234.0</td>\n",
       "      <td>4065.0</td>\n",
       "      <td>5713.0</td>\n",
       "      <td>4890.0</td>\n",
       "      <td>6670.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>72583</th>\n",
       "      <td>2</td>\n",
       "      <td>16833</td>\n",
       "      <td>4995.0</td>\n",
       "      <td>6030.0</td>\n",
       "      <td>5895.0</td>\n",
       "      <td>7012.0</td>\n",
       "      <td>5895.0</td>\n",
       "      <td>6860.0</td>\n",
       "      <td>6867.0</td>\n",
       "      <td>7909.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>72602</th>\n",
       "      <td>3</td>\n",
       "      <td>4825</td>\n",
       "      <td>5498.0</td>\n",
       "      <td>6845.0</td>\n",
       "      <td>10038.0</td>\n",
       "      <td>12497.0</td>\n",
       "      <td>5487.0</td>\n",
       "      <td>6839.0</td>\n",
       "      <td>9220.0</td>\n",
       "      <td>10876.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>43 rows × 145 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       VehicleAge  VehOdo  MMRAcquisitionAuctionAveragePrice  \\\n",
       "147             2   12628                             5657.0   \n",
       "2876            1   13445                            19546.0   \n",
       "6111            1   19610                            14539.0   \n",
       "7564            4   14547                             3641.0   \n",
       "7977            1   10643                             6217.0   \n",
       "11084           5   17538                             2824.0   \n",
       "12094           6   15655                             4281.0   \n",
       "13092           2   10095                            32250.0   \n",
       "14246           8   19070                             3919.0   \n",
       "17491           6    5368                             2617.0   \n",
       "19000           5   19430                             4481.0   \n",
       "20035           3   15894                             4066.0   \n",
       "27551           3   15249                             5427.0   \n",
       "28299           5   14187                             3498.0   \n",
       "36221           7   18440                             2622.0   \n",
       "36699           3   16545                             4801.0   \n",
       "40461           2   18813                             5824.0   \n",
       "42083           1    9446                            27680.0   \n",
       "42734           4    9878                             3880.0   \n",
       "45365           2   19751                            10956.0   \n",
       "45583           2   18923                             5582.0   \n",
       "46703           5   14474                             4800.0   \n",
       "46803           1   17502                            35722.0   \n",
       "48345           5   19200                             4041.0   \n",
       "49287           5   18777                             2463.0   \n",
       "49610           2   12926                             6324.0   \n",
       "50021           2   11662                            19250.0   \n",
       "50367           3   15409                            16536.0   \n",
       "51978           2   15743                             5828.0   \n",
       "53834           2   13924                            32063.0   \n",
       "53979           5   19825                             2941.0   \n",
       "55574           5   16574                             4481.0   \n",
       "55789           7   19983                             2622.0   \n",
       "57292           6   19534                             2560.0   \n",
       "58356           1    8706                             8440.0   \n",
       "62083           1   19137                             9335.0   \n",
       "62436           7   14961                                0.0   \n",
       "62557           5   16713                             4481.0   \n",
       "64223           6   17281                             4281.0   \n",
       "68169           6   14519                             1683.0   \n",
       "71098           5   19813                             4481.0   \n",
       "72583           2   16833                             4995.0   \n",
       "72602           3    4825                             5498.0   \n",
       "\n",
       "       MMRAcquisitionAuctionCleanPrice  MMRAcquisitionRetailAveragePrice  \\\n",
       "147                             6350.0                            6610.0   \n",
       "2876                           20809.0                           23361.0   \n",
       "6111                           15841.0                           20008.0   \n",
       "7564                            4480.0                            4432.0   \n",
       "7977                            7325.0                            7214.0   \n",
       "11084                           4200.0                            3550.0   \n",
       "12094                           5752.0                            8606.0   \n",
       "13092                          35215.0                           35330.0   \n",
       "14246                           5867.0                            6134.0   \n",
       "17491                           3512.0                            3326.0   \n",
       "19000                           6235.0                            5339.0   \n",
       "20035                           4852.0                            4891.0   \n",
       "27551                           6508.0                            6361.0   \n",
       "28299                           4570.0                            4278.0   \n",
       "36221                           3740.0                            3332.0   \n",
       "36699                           5679.0                            5685.0   \n",
       "40461                           7192.0                            6790.0   \n",
       "42083                          30114.0                           31599.0   \n",
       "42734                           4795.0                            7030.0   \n",
       "45365                          12651.0                           12332.0   \n",
       "45583                           6339.0                            6529.0   \n",
       "46703                           5892.0                            5684.0   \n",
       "46803                          36859.0                           39080.0   \n",
       "48345                           4991.0                            7360.0   \n",
       "49287                           3276.0                            3160.0   \n",
       "49610                           7586.0                            7330.0   \n",
       "50021                          23021.0                           21290.0   \n",
       "50367                          18384.0                           20976.0   \n",
       "51978                           6720.0                            9332.0   \n",
       "53834                          35108.0                           37885.0   \n",
       "53979                           4054.0                            6188.0   \n",
       "55574                           6235.0                            5339.0   \n",
       "55789                           3740.0                            3332.0   \n",
       "57292                           3765.0                            3265.0   \n",
       "58356                           8841.0                           11948.0   \n",
       "62083                          10203.0                           10582.0   \n",
       "62436                              0.0                               0.0   \n",
       "62557                           6235.0                            5339.0   \n",
       "64223                           5752.0                            8606.0   \n",
       "68169                           2473.0                            2318.0   \n",
       "71098                           6235.0                            5339.0   \n",
       "72583                           6030.0                            5895.0   \n",
       "72602                           6845.0                           10038.0   \n",
       "\n",
       "       MMRAcquisitonRetailCleanPrice  MMRCurrentAuctionAveragePrice  \\\n",
       "147                           7358.0                         5617.0   \n",
       "2876                         24870.0                        20817.0   \n",
       "6111                         21662.0                        17343.0   \n",
       "7564                          5338.0                         4334.0   \n",
       "7977                          8411.0                         6740.0   \n",
       "11084                         5036.0                         3015.0   \n",
       "12094                         9835.0                         3953.0   \n",
       "13092                        38532.0                        32250.0   \n",
       "14246                         7433.0                         3828.0   \n",
       "17491                         4293.0                         2690.0   \n",
       "19000                         7234.0                         4481.0   \n",
       "20035                         5740.0                         4066.0   \n",
       "27551                         7529.0                         5540.0   \n",
       "28299                         5436.0                         2856.0   \n",
       "36221                         4539.0                         2622.0   \n",
       "36699                         6633.0                         4396.0   \n",
       "40461                         8267.0                         5824.0   \n",
       "42083                        33736.0                        27795.0   \n",
       "42734                         8244.0                         3136.0   \n",
       "45365                        14163.0                        11155.0   \n",
       "45583                         7346.0                         5667.0   \n",
       "46703                         6863.0                         4876.0   \n",
       "46803                        40308.0                        35722.0   \n",
       "48345                         8241.0                         3956.0   \n",
       "49287                         4038.0                         2994.0   \n",
       "49610                         8693.0                         6914.0   \n",
       "50021                        25363.0                            0.0   \n",
       "50367                        23011.0                        16988.0   \n",
       "51978                        10761.0                         5700.0   \n",
       "53834                        41482.0                        31127.0   \n",
       "53979                         8529.0                         3439.0   \n",
       "55574                         7234.0                         4481.0   \n",
       "55789                         4539.0                         2622.0   \n",
       "57292                         4566.0                         2560.0   \n",
       "58356                        12346.0                         8601.0   \n",
       "62083                        11519.0                         9335.0   \n",
       "62436                            0.0                            0.0   \n",
       "62557                         7234.0                         4481.0   \n",
       "64223                         9835.0                         4736.0   \n",
       "68169                         3171.0                         1683.0   \n",
       "71098                         7234.0                         4065.0   \n",
       "72583                         7012.0                         5895.0   \n",
       "72602                        12497.0                         5487.0   \n",
       "\n",
       "       MMRCurrentAuctionCleanPrice  MMRCurrentRetailAveragePrice  \\\n",
       "147                         6487.0                        6566.0   \n",
       "2876                       21601.0                       24286.0   \n",
       "6111                       18942.0                       20291.0   \n",
       "7564                        5392.0                        5181.0   \n",
       "7977                        7937.0                        7779.0   \n",
       "11084                       3879.0                        6214.0   \n",
       "12094                       5318.0                        7761.0   \n",
       "13092                      35215.0                       35330.0   \n",
       "14246                       5339.0                        6614.0   \n",
       "17491                       3375.0                        3405.0   \n",
       "19000                       6235.0                        5339.0   \n",
       "20035                       4852.0                        4891.0   \n",
       "27551                       6569.0                        6483.0   \n",
       "28299                       3976.0                        3584.0   \n",
       "36221                       3740.0                        3332.0   \n",
       "36699                       5208.0                        5248.0   \n",
       "40461                       7192.0                        6790.0   \n",
       "42083                      30136.0                       31128.0   \n",
       "42734                       4160.0                        6380.0   \n",
       "45365                      12897.0                       12547.0   \n",
       "45583                       6367.0                        6620.0   \n",
       "46703                       5856.0                        5766.0   \n",
       "46803                      36859.0                       39080.0   \n",
       "48345                       5032.0                        7592.0   \n",
       "49287                       3629.0                        3734.0   \n",
       "49610                       7776.0                        7967.0   \n",
       "50021                          0.0                           0.0   \n",
       "50367                      18762.0                       21431.0   \n",
       "51978                       6702.0                        9249.0   \n",
       "53834                      34798.0                       38151.0   \n",
       "53979                       4353.0                        6713.0   \n",
       "55574                       6235.0                        5339.0   \n",
       "55789                       3740.0                        3332.0   \n",
       "57292                       3765.0                        3265.0   \n",
       "58356                       9104.0                       11739.0   \n",
       "62083                      10203.0                       10582.0   \n",
       "62436                          0.0                           0.0   \n",
       "62557                       6235.0                        5339.0   \n",
       "64223                       5987.0                        8223.0   \n",
       "68169                       2473.0                        2318.0   \n",
       "71098                       5713.0                        4890.0   \n",
       "72583                       6860.0                        6867.0   \n",
       "72602                       6839.0                        9220.0   \n",
       "\n",
       "       MMRCurrentRetailCleanPrice  ...  VNST_PA  VNST_SC  VNST_TN  VNST_TX  \\\n",
       "147                        7506.0  ...        0        0        0        0   \n",
       "2876                      25060.0  ...        0        0        0        0   \n",
       "6111                      22079.0  ...        0        0        0        0   \n",
       "7564                       6323.0  ...        0        1        0        0   \n",
       "7977                       9072.0  ...        0        0        1        0   \n",
       "11084                      7007.0  ...        0        0        0        0   \n",
       "12094                      9057.0  ...        0        0        0        0   \n",
       "13092                     38532.0  ...        0        0        0        0   \n",
       "14246                      7872.0  ...        0        0        0        0   \n",
       "17491                      4145.0  ...        0        0        0        0   \n",
       "19000                      7234.0  ...        0        0        0        0   \n",
       "20035                      5740.0  ...        0        0        0        1   \n",
       "27551                      7595.0  ...        0        0        0        0   \n",
       "28299                      4794.0  ...        0        0        0        0   \n",
       "36221                      4539.0  ...        0        0        0        0   \n",
       "36699                      6125.0  ...        0        0        1        0   \n",
       "40461                      8267.0  ...        0        0        0        0   \n",
       "42083                     33014.0  ...        0        0        0        1   \n",
       "42734                      7925.0  ...        0        0        1        0   \n",
       "45365                     14429.0  ...        0        0        0        1   \n",
       "45583                      7376.0  ...        0        0        0        0   \n",
       "46703                      6824.0  ...        0        0        0        0   \n",
       "46803                     40308.0  ...        0        0        0        0   \n",
       "48345                      8498.0  ...        0        0        0        0   \n",
       "49287                      4419.0  ...        0        0        0        0   \n",
       "49610                      8898.0  ...        0        0        0        0   \n",
       "50021                         0.0  ...        0        0        0        0   \n",
       "50367                     23285.0  ...        0        0        0        0   \n",
       "51978                     10140.0  ...        0        0        0        0   \n",
       "53834                     41062.0  ...        0        0        0        0   \n",
       "53979                      7833.0  ...        0        0        0        0   \n",
       "55574                      7234.0  ...        0        0        0        0   \n",
       "55789                      4539.0  ...        0        0        0        0   \n",
       "57292                      4566.0  ...        0        0        0        0   \n",
       "58356                     12276.0  ...        0        0        0        1   \n",
       "62083                     11519.0  ...        0        0        0        0   \n",
       "62436                         0.0  ...        0        0        0        1   \n",
       "62557                      7234.0  ...        0        0        0        0   \n",
       "64223                      9902.0  ...        0        0        0        0   \n",
       "68169                      3171.0  ...        0        0        0        0   \n",
       "71098                      6670.0  ...        0        0        0        0   \n",
       "72583                      7909.0  ...        0        0        0        0   \n",
       "72602                     10876.0  ...        0        0        0        1   \n",
       "\n",
       "       VNST_UT  VNST_VA  VNST_WA  VNST_WI  VNST_WV  IsBadBuy  \n",
       "147          0        0        0        0        0         0  \n",
       "2876         0        1        0        0        0         1  \n",
       "6111         0        0        0        0        0         1  \n",
       "7564         0        0        0        0        0         0  \n",
       "7977         0        0        0        0        0         0  \n",
       "11084        0        0        0        0        0         0  \n",
       "12094        0        0        0        0        0         0  \n",
       "13092        0        0        0        0        0         1  \n",
       "14246        0        0        0        0        0         0  \n",
       "17491        0        0        0        0        0         0  \n",
       "19000        0        0        0        0        0         0  \n",
       "20035        0        0        0        0        0         0  \n",
       "27551        0        0        0        0        0         0  \n",
       "28299        0        0        0        0        0         0  \n",
       "36221        0        0        0        0        0         0  \n",
       "36699        0        0        0        0        0         1  \n",
       "40461        0        0        0        0        0         0  \n",
       "42083        0        0        0        0        0         1  \n",
       "42734        0        0        0        0        0         0  \n",
       "45365        0        0        0        0        0         0  \n",
       "45583        0        0        0        0        0         0  \n",
       "46703        0        0        0        0        0         0  \n",
       "46803        0        0        0        0        0         1  \n",
       "48345        0        0        0        0        0         0  \n",
       "49287        0        0        0        0        0         0  \n",
       "49610        0        0        0        0        0         0  \n",
       "50021        0        0        0        0        0         1  \n",
       "50367        0        0        0        0        0         1  \n",
       "51978        0        0        0        0        0         1  \n",
       "53834        0        0        0        0        0         1  \n",
       "53979        0        0        0        0        0         0  \n",
       "55574        0        0        0        0        0         0  \n",
       "55789        0        0        0        0        0         0  \n",
       "57292        0        0        0        0        0         0  \n",
       "58356        0        0        0        0        0         0  \n",
       "62083        0        0        0        0        0         0  \n",
       "62436        0        0        0        0        0         1  \n",
       "62557        0        0        0        0        0         0  \n",
       "64223        0        0        0        0        0         0  \n",
       "68169        0        0        0        0        0         0  \n",
       "71098        0        0        0        0        0         0  \n",
       "72583        0        0        0        0        0         0  \n",
       "72602        0        0        0        0        0         1  \n",
       "\n",
       "[43 rows x 145 columns]"
      ]
     },
     "execution_count": 154,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data[data.VehOdo.between(1000,20000)] # This is also possible"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 166,
   "id": "19ab61f1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "    VehicleAge  VehOdo\n",
      "0            2   75645\n",
      "1            4   84095\n",
      "2            3   51780\n",
      "3            6   73746\n",
      "4            7   81897\n",
      "..         ...     ...\n",
      "95           4   88650\n",
      "96           8   86059\n",
      "97           9   82417\n",
      "98           8   86206\n",
      "99           3   60225\n",
      "\n",
      "[100 rows x 2 columns]\n",
      "84095\n"
     ]
    }
   ],
   "source": [
    "#iloc：通过行号选取数据，即通过数据所在的自然行列数为选取数据\n",
    "#行号和索引有所差异，进行筛选后的数据行号会根据新的DataFrame变化，而索引不会发生变化\n",
    "print(data.iloc[0:100, 0:2])\n",
    "print(data.iloc[1,1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 161,
   "id": "c417fa19",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "84095\n",
      "84095\n"
     ]
    }
   ],
   "source": [
    "#at/iat：通过标签或行号获取某个数值的具体位置,不推荐使用，但at速度会快很多\n",
    "#获取第2行第3列数据\n",
    "print(data.iat[1,1])\n",
    "#获取第2行，VehOdo列位置的数据\n",
    "print(data.at[1,'VehOdo'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 173,
   "id": "abc61cbc",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "5.426204904004408\n"
     ]
    }
   ],
   "source": [
    "from timeit import timeit \n",
    "\n",
    "def func():\n",
    "    for i in range(len(data)):\n",
    "        s=data.loc[i,'VehOdo']\n",
    "\n",
    "# timeit(函数名_字符串，运行环境_字符串，number=运行次数)\n",
    "t = timeit('func()', 'from __main__ import func', number=5)\n",
    "print(t)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 172,
   "id": "287b6ba9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "14.487212630003341\n"
     ]
    }
   ],
   "source": [
    "def func():\n",
    "    for i in range(len(data)):\n",
    "        s=data.iloc[i,1]\n",
    "\n",
    "# timeit(函数名_字符串，运行环境_字符串，number=运行次数)\n",
    "t = timeit('func()', 'from __main__ import func', number=5)\n",
    "print(t)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 168,
   "id": "f14b9bcf",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2.332775983995816\n"
     ]
    }
   ],
   "source": [
    "def func():\n",
    "    for i in range(len(data)):\n",
    "        s=data.at[i,'VehOdo']\n",
    "\n",
    "# timeit(函数名_字符串，运行环境_字符串，number=运行次数)\n",
    "t = timeit('func()', 'from __main__ import func', number=5)\n",
    "print(t)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 171,
   "id": "1bd48b43",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "11.918753186997492\n"
     ]
    }
   ],
   "source": [
    "def func():\n",
    "    for i in range(len(data)):\n",
    "        s=data.iat[i,1]\n",
    "\n",
    "# timeit(函数名_字符串，运行环境_字符串，number=运行次数)\n",
    "t = timeit('func()', 'from __main__ import func', number=5)\n",
    "print(t)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 170,
   "id": "17cd53b0",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2.8630749420044594\n"
     ]
    }
   ],
   "source": [
    "def func():\n",
    "    for i in range(len(data)):\n",
    "        s=data['VehOdo'][i]\n",
    "\n",
    "# timeit(函数名_字符串，运行环境_字符串，number=运行次数)\n",
    "t = timeit('func()', 'from __main__ import func', number=5)\n",
    "print(t)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd3ffede",
   "metadata": {},
   "source": [
    "Using maps for pandas.表达式map操作"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 174,
   "id": "5014fbad",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0        1\n",
       "1        1\n",
       "2        1\n",
       "3        1\n",
       "4        1\n",
       "        ..\n",
       "72978    1\n",
       "72979    1\n",
       "72980    1\n",
       "72981    1\n",
       "72982    1\n",
       "Name: VehicleAge, Length: 72983, dtype: int64"
      ]
     },
     "execution_count": 174,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.iloc[:, 0].map(lambda x: 0 if x > 10 else 1)\n",
    "data.iloc[:, 0].apply(lambda x: 0 if x > 10 else 1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4bed586c",
   "metadata": {},
   "source": [
    "groupby 分组汇总"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 176,
   "id": "19fedc39",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>IsBadBuy</th>\n",
       "      <th>VehicleAge</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>4.069461</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>4.940954</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   IsBadBuy  VehicleAge\n",
       "0         0    4.069461\n",
       "1         1    4.940954"
      ]
     },
     "execution_count": 176,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "aggregated = data[['VehicleAge', 'IsBadBuy']].groupby('IsBadBuy', as_index=False).agg('mean')\n",
    "aggregated"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "21eb1985",
   "metadata": {},
   "source": [
    "merge拼接"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 177,
   "id": "73a9a532",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>VehicleAge_x</th>\n",
       "      <th>VehOdo</th>\n",
       "      <th>MMRAcquisitionAuctionAveragePrice</th>\n",
       "      <th>MMRAcquisitionAuctionCleanPrice</th>\n",
       "      <th>MMRAcquisitionRetailAveragePrice</th>\n",
       "      <th>MMRAcquisitonRetailCleanPrice</th>\n",
       "      <th>MMRCurrentAuctionAveragePrice</th>\n",
       "      <th>MMRCurrentAuctionCleanPrice</th>\n",
       "      <th>MMRCurrentRetailAveragePrice</th>\n",
       "      <th>MMRCurrentRetailCleanPrice</th>\n",
       "      <th>...</th>\n",
       "      <th>VNST_SC</th>\n",
       "      <th>VNST_TN</th>\n",
       "      <th>VNST_TX</th>\n",
       "      <th>VNST_UT</th>\n",
       "      <th>VNST_VA</th>\n",
       "      <th>VNST_WA</th>\n",
       "      <th>VNST_WI</th>\n",
       "      <th>VNST_WV</th>\n",
       "      <th>IsBadBuy</th>\n",
       "      <th>VehicleAge_y</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2</td>\n",
       "      <td>75645</td>\n",
       "      <td>8337.0</td>\n",
       "      <td>9645.0</td>\n",
       "      <td>12659.0</td>\n",
       "      <td>13621.0</td>\n",
       "      <td>8201.0</td>\n",
       "      <td>9792.0</td>\n",
       "      <td>12002.0</td>\n",
       "      <td>13607.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>4.069461</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>4</td>\n",
       "      <td>84095</td>\n",
       "      <td>8829.0</td>\n",
       "      <td>10387.0</td>\n",
       "      <td>10035.0</td>\n",
       "      <td>11718.0</td>\n",
       "      <td>8987.0</td>\n",
       "      <td>10565.0</td>\n",
       "      <td>10206.0</td>\n",
       "      <td>11910.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>4.069461</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>51780</td>\n",
       "      <td>6003.0</td>\n",
       "      <td>6904.0</td>\n",
       "      <td>6983.0</td>\n",
       "      <td>7956.0</td>\n",
       "      <td>6517.0</td>\n",
       "      <td>7437.0</td>\n",
       "      <td>7538.0</td>\n",
       "      <td>8532.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>4.069461</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>6</td>\n",
       "      <td>73746</td>\n",
       "      <td>5438.0</td>\n",
       "      <td>6804.0</td>\n",
       "      <td>8608.0</td>\n",
       "      <td>9687.0</td>\n",
       "      <td>4086.0</td>\n",
       "      <td>5559.0</td>\n",
       "      <td>7708.0</td>\n",
       "      <td>8821.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>4.069461</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>7</td>\n",
       "      <td>81897</td>\n",
       "      <td>1911.0</td>\n",
       "      <td>2753.0</td>\n",
       "      <td>2564.0</td>\n",
       "      <td>3473.0</td>\n",
       "      <td>1367.0</td>\n",
       "      <td>2128.0</td>\n",
       "      <td>1976.0</td>\n",
       "      <td>2798.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>4.069461</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 146 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   VehicleAge_x  VehOdo  MMRAcquisitionAuctionAveragePrice  \\\n",
       "0             2   75645                             8337.0   \n",
       "1             4   84095                             8829.0   \n",
       "2             3   51780                             6003.0   \n",
       "3             6   73746                             5438.0   \n",
       "4             7   81897                             1911.0   \n",
       "\n",
       "   MMRAcquisitionAuctionCleanPrice  MMRAcquisitionRetailAveragePrice  \\\n",
       "0                           9645.0                           12659.0   \n",
       "1                          10387.0                           10035.0   \n",
       "2                           6904.0                            6983.0   \n",
       "3                           6804.0                            8608.0   \n",
       "4                           2753.0                            2564.0   \n",
       "\n",
       "   MMRAcquisitonRetailCleanPrice  MMRCurrentAuctionAveragePrice  \\\n",
       "0                        13621.0                         8201.0   \n",
       "1                        11718.0                         8987.0   \n",
       "2                         7956.0                         6517.0   \n",
       "3                         9687.0                         4086.0   \n",
       "4                         3473.0                         1367.0   \n",
       "\n",
       "   MMRCurrentAuctionCleanPrice  MMRCurrentRetailAveragePrice  \\\n",
       "0                       9792.0                       12002.0   \n",
       "1                      10565.0                       10206.0   \n",
       "2                       7437.0                        7538.0   \n",
       "3                       5559.0                        7708.0   \n",
       "4                       2128.0                        1976.0   \n",
       "\n",
       "   MMRCurrentRetailCleanPrice  ...  VNST_SC  VNST_TN  VNST_TX  VNST_UT  \\\n",
       "0                     13607.0  ...        0        0        0        0   \n",
       "1                     11910.0  ...        0        0        1        0   \n",
       "2                      8532.0  ...        0        0        0        0   \n",
       "3                      8821.0  ...        0        0        0        0   \n",
       "4                      2798.0  ...        1        0        0        0   \n",
       "\n",
       "   VNST_VA  VNST_WA  VNST_WI  VNST_WV  IsBadBuy  VehicleAge_y  \n",
       "0        0        0        0        0         0      4.069461  \n",
       "1        0        0        0        0         0      4.069461  \n",
       "2        1        0        0        0         0      4.069461  \n",
       "3        0        0        0        0         0      4.069461  \n",
       "4        0        0        0        0         0      4.069461  \n",
       "\n",
       "[5 rows x 146 columns]"
      ]
     },
     "execution_count": 177,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "merge_df = data.merge(aggregated, how='left', on='IsBadBuy')\n",
    "merge_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4a051a08",
   "metadata": {},
   "source": [
    "类似sql的操作: 运行很慢"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 195,
   "id": "c7a8b25f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>IsBadBuy</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   IsBadBuy\n",
       "0         0\n",
       "1         0\n",
       "2         0\n",
       "3         0\n",
       "4         0"
      ]
     },
     "execution_count": 195,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from pandasql import sqldf\n",
    "\n",
    "#注意全局变量要用两次\n",
    "pysqldf = lambda q: sqldf(q,globals())\n",
    "\n",
    "\n",
    "#删除列，有重复列会报错\n",
    "data_clean=data.drop(labels='Transmission_Manual',axis=1)\n",
    "\n",
    "q='SELECT IsBadBuy FROM data_clean LIMIT 10'\n",
    "\n",
    "names = pysqldf(q)\n",
    "names.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "54a90179",
   "metadata": {},
   "source": [
    "describe操作"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 197,
   "id": "9ccbb5cb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1. count    72983.000000\n",
      "mean         4.176644\n",
      "std          1.712210\n",
      "min          0.000000\n",
      "25%          3.000000\n",
      "50%          4.000000\n",
      "75%          5.000000\n",
      "max          9.000000\n",
      "Name: VehicleAge, dtype: float64\n",
      "2. count    72983.000000\n",
      "mean         4.176644\n",
      "std          1.712210\n",
      "min          0.000000\n",
      "0%           0.000000\n",
      "10%          2.000000\n",
      "20%          3.000000\n",
      "30%          3.000000\n",
      "40%          4.000000\n",
      "50%          4.000000\n",
      "60%          4.000000\n",
      "70%          5.000000\n",
      "80%          6.000000\n",
      "90%          7.000000\n",
      "max          9.000000\n",
      "Name: VehicleAge, dtype: float64\n",
      "3.          VehicleAge                                                         \\\n",
      "              count      mean       std  min   0%  10%  20%  30%  40%  50%   \n",
      "IsBadBuy                                                                     \n",
      "0           64007.0  4.069461  1.677024  0.0  0.0  2.0  3.0  3.0  4.0  4.0   \n",
      "1            8976.0  4.940954  1.765302  1.0  1.0  3.0  3.0  4.0  4.0  5.0   \n",
      "\n",
      "          ... VNST_WV                                               \n",
      "          ...     10%  20%  30%  40%  50%  60%  70%  80%  90%  max  \n",
      "IsBadBuy  ...                                                       \n",
      "0         ...     0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  \n",
      "1         ...     0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  \n",
      "\n",
      "[2 rows x 2160 columns]\n"
     ]
    }
   ],
   "source": [
    "print('1.',data.VehicleAge.describe())\n",
    "print('2.',data.VehicleAge.describe(percentiles=np.arange(0, 1, 0.1)))\n",
    "print('3.',data.groupby('IsBadBuy').describe(percentiles=np.arange(0, 1, 0.1)))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "213e1f60",
   "metadata": {},
   "source": [
    "### pandas.pivot_table  \n",
    "Pivot tables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "c37c5ddf",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>foo</td>\n",
       "      <td>one</td>\n",
       "      <td>small</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>foo</td>\n",
       "      <td>one</td>\n",
       "      <td>large</td>\n",
       "      <td>2</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>foo</td>\n",
       "      <td>one</td>\n",
       "      <td>large</td>\n",
       "      <td>2</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>foo</td>\n",
       "      <td>two</td>\n",
       "      <td>small</td>\n",
       "      <td>3</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>foo</td>\n",
       "      <td>two</td>\n",
       "      <td>small</td>\n",
       "      <td>3</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     A    B      C  D  E\n",
       "0  foo  one  small  1  2\n",
       "1  foo  one  large  2  4\n",
       "2  foo  one  large  2  5\n",
       "3  foo  two  small  3  5\n",
       "4  foo  two  small  3  6"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "df = pd.DataFrame({\"A\": [\"foo\", \"foo\", \"foo\", \"foo\", \"foo\",\n",
    "                         \"bar\", \"bar\", \"bar\", \"bar\"],\n",
    "                   \"B\": [\"one\", \"one\", \"one\", \"two\", \"two\",\n",
    "                         \"one\", \"one\", \"two\", \"two\"],\n",
    "                   \"C\": [\"small\", \"large\", \"large\", \"small\",\n",
    "                         \"small\", \"large\", \"small\", \"small\",\n",
    "                         \"large\"],\n",
    "                   \"D\": [1, 2, 2, 3, 3, 4, 5, 6, 7],\n",
    "                   \"E\": [2, 4, 5, 5, 6, 6, 8, 9, 9]})\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "id": "7a9094ff",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>D</th>\n",
       "      <th colspan=\"3\" halign=\"left\">E</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>mean</th>\n",
       "      <th>max</th>\n",
       "      <th>mean</th>\n",
       "      <th>min</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>A</th>\n",
       "      <th>C</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">bar</th>\n",
       "      <th>large</th>\n",
       "      <td>5.500000</td>\n",
       "      <td>9</td>\n",
       "      <td>7.500000</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>small</th>\n",
       "      <td>5.500000</td>\n",
       "      <td>9</td>\n",
       "      <td>8.500000</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">foo</th>\n",
       "      <th>large</th>\n",
       "      <td>2.000000</td>\n",
       "      <td>5</td>\n",
       "      <td>4.500000</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>small</th>\n",
       "      <td>2.333333</td>\n",
       "      <td>6</td>\n",
       "      <td>4.333333</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  D   E              \n",
       "               mean max      mean min\n",
       "A   C                                \n",
       "bar large  5.500000   9  7.500000   6\n",
       "    small  5.500000   9  8.500000   8\n",
       "foo large  2.000000   5  4.500000   4\n",
       "    small  2.333333   6  4.333333   2"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
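   ],
   "source": [
    "# Pivot on A and C: mean of D, and min/max/mean of E.\n",
    "# String aggregation names avoid the deprecation warnings newer pandas emits for np.mean/min/max.\n",
    "table = pd.pivot_table(df, values=['D', 'E'], index=['A', 'C'],\n",
    "                       aggfunc={'D': 'mean',\n",
    "                                'E': ['min', 'max', 'mean']}, fill_value=0)\n",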
    "table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "id": "bb82dd8b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "MultiIndex([('D', 'mean'),\n",
      "            ('E',  'max'),\n",
      "            ('E', 'mean'),\n",
      "            ('E',  'min')],\n",
      "           )\n",
      "Index(['A', 'C', 'D_mean', 'E_max', 'E_mean', 'E_min'], dtype='object')\n",
      "     A      C    D_mean  E_max    E_mean  E_min\n",
      "0  bar  large  5.500000      9  7.500000      6\n",
      "1  bar  small  5.500000      9  8.500000      8\n",
      "2  foo  large  2.000000      5  4.500000      4\n",
      "3  foo  small  2.333333      6  4.333333      2\n"
     ]
    }
   ],
   "source": [
    "# Flatten the MultiIndex columns into single-level names such as D_mean\n",
    "print(table.columns)\n",
    "table.columns = [s1 + '_' + str(s2) for (s1, s2) in table.columns.tolist()]\n",
    "table.reset_index(inplace=True)\n",
    "print(table.columns)\n",
    "print(table.head())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "id": "d72c117e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  col1  col2 col3\n",
      "0    a     2    c\n",
      "1    a     2    c\n",
      "2    a     2    c\n",
      "3    b     2    d\n",
      "4    b     2    d\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col2</th>\n",
       "      <th>hi</th>\n",
       "      <th>hello</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2</td>\n",
       "      <td>col1</td>\n",
       "      <td>a</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>col1</td>\n",
       "      <td>a</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>col1</td>\n",
       "      <td>a</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2</td>\n",
       "      <td>col1</td>\n",
       "      <td>b</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2</td>\n",
       "      <td>col1</td>\n",
       "      <td>b</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   col2    hi hello\n",
       "0     2  col1     a\n",
       "1     2  col1     a\n",
       "2     2  col1     a\n",
       "3     2  col1     b\n",
       "4     2  col1     b"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# melt: commonly used to convert between wide and long tables\n",
    "d = {'col1': ['a','a','a','b','b'], 'col2': [2,2,2,2,2],'col3':['c','c','c','d','d']}\n",
    "df = pd.DataFrame(data=d)\n",
    "print(df.head())\n",
    "pd.melt(df,id_vars=['col2'],value_vars=['col1'],var_name='hi',value_name='hello')"
   ]
  },
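  {
   "cell_type": "markdown",
   "id": "a58d5ed0",
   "metadata": {},
   "source": [
    "### Handling outliers and missing values\n",
    "1.  Drop the samples (rows) that contain missing values\n",
    "2.  Drop the columns (features) that contain missing values\n",
    "3.  Fill the missing values with a chosen value (0, mean, median, etc.)  \n"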
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 223,
   "id": "550030be",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "       name        toy       born\n",
      "0    Alfred        NaN        NaT\n",
      "1    Batman  Batmobile 1940-04-25\n",
      "2  Catwoman   Bullwhip        NaT\n",
      "     name        toy       born\n",
      "1  Batman  Batmobile 1940-04-25\n",
      "       name\n",
      "0    Alfred\n",
      "1    Batman\n",
      "2  Catwoman\n",
      "       name        toy       born\n",
      "0    Alfred        NaN        NaT\n",
      "1    Batman  Batmobile 1940-04-25\n",
      "2  Catwoman   Bullwhip        NaT\n",
      "       name        toy       born\n",
      "1    Batman  Batmobile 1940-04-25\n",
      "2  Catwoman   Bullwhip        NaT\n",
      "     name        toy       born\n",
      "1  Batman  Batmobile 1940-04-25\n",
      "       name\n",
      "0    Alfred\n",
      "1    Batman\n",
      "2  Catwoman\n"
     ]
    }
   ],
   "source": [
    "# Dropping missing values\n",
    "# https://blog.csdn.net/dss_dssssd/article/details/82814673\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "df = pd.DataFrame({\"name\": ['Alfred', 'Batman', 'Catwoman'],\n",
    "                   \"toy\": [np.nan, 'Batmobile', 'Bullwhip'],\n",
    "                   \"born\": [pd.NaT, pd.Timestamp(\"1940-04-25\"), pd.NaT]})\n",
    "print(df)\n",
    "# Drop the rows where at least one element is missing.\n",
    "print(df.dropna())\n",
    "\n",
    "# Drop the columns where at least one element is missing.\n",
    "print(df.dropna(axis='columns'))\n",
    "\n",
    "# Drop the rows where all elements are missing.\n",
    "print(df.dropna(how='all'))\n",
    "\n",
    "# Keep only the rows with at least 2 non-NA values.\n",
    "print(df.dropna(thresh=2))\n",
    "\n",
    "# Define in which columns to look for missing values.\n",
    "print(df.dropna(subset=['name', 'born']))\n",
    "\n",
    "# Drop columns by name\n",
    "print(df.drop(['toy', 'born'], axis=1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 222,
   "id": "ff660756",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "       name        toy                 born\n",
      "0    Alfred          0                    0\n",
      "1    Batman  Batmobile  1940-04-25 00:00:00\n",
      "2  Catwoman   Bullwhip                    0\n"
     ]
    }
   ],
   "source": [
    "# Simple replacement: fill every missing value with 0\n",
    "print(df.fillna(0))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 215,
   "id": "9d584746",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "     A    B   C  D\n",
      "0  NaN  2.0 NaN  0\n",
      "1  3.0  4.0 NaN  1\n",
      "2  NaN  NaN NaN  5\n",
      "3  NaN  3.0 NaN  4\n",
      "     A    B    C  D\n",
      "0  3.0  2.0  0.0  0\n",
      "1  3.0  4.0  0.0  1\n",
      "2  3.0  3.0  0.0  5\n",
      "3  3.0  3.0  0.0  4\n"
     ]
    }
   ],
   "source": [
    "# Impute missing values with scikit-learn\n",
    "# https://zhuanlan.zhihu.com/p/115103738\n",
    "from sklearn.impute import SimpleImputer\n",
    "df = pd.DataFrame([[np.nan, 2, np.nan, 0],\n",
    "                   [3, 4, np.nan, 1],\n",
    "                   [np.nan, np.nan, np.nan, 5],\n",
    "                   [np.nan, 3, np.nan, 4]],\n",
    "                  columns=list('ABCD'))\n",
    "print(df)\n",
    "\n",
    "\n",
    "# Mean:\n",
    "df_mean = SimpleImputer(missing_values=np.nan, strategy='mean',copy=False)\n",
    "\n",
    "# Median:\n",
    "df_median = SimpleImputer(missing_values=np.nan, strategy='median',copy=False)\n",
    "\n",
    "# Constant 0:\n",
    "df_0 = SimpleImputer(strategy=\"constant\",fill_value=0,copy=False)\n",
    "\n",
    "# Most frequent value (mode):\n",
    "df_most_frequent = SimpleImputer(missing_values=np.nan, strategy='most_frequent',copy=False)\n",
    "\n",
    "# Apply a different strategy to each column\n",
    "# Column A: mean\n",
    "df_A = df.loc[:,'A'].values.reshape(-1,1)  # reshape to a 2-D column, as SimpleImputer expects\n",
    "df.loc[:,'A']=df_mean.fit_transform(df_A)\n",
    "\n",
    "# Column B: median\n",
    "df_B = df.loc[:,'B'].values.reshape(-1,1)\n",
    "df.loc[:,'B']=df_median.fit_transform(df_B)\n",
    "\n",
    "# Column C: constant 0\n",
    "df_C = df.loc[:,'C'].values.reshape(-1,1)\n",
    "df.loc[:,'C']=df_0.fit_transform(df_C)\n",
    "\n",
    "# Column D: most frequent value\n",
    "df_D = df.loc[:,'D'].values.reshape(-1,1)\n",
    "df.loc[:,'D']=df_most_frequent.fit_transform(df_D)\n",
    "print(df)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f32cdb33",
   "metadata": {},
   "source": [
    "[pandas: rows to columns, columns to rows, and expanding one row into multiple rows](https://www.cnblogs.com/traditional/p/11967360.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "4bbbc8a9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "     A    B      C  D  E\n",
      "0  foo  one  small  1  2\n",
      "1  foo  one  large  2  4\n",
      "2  foo  one  large  2  5\n",
      "3  foo  two  small  3  5\n",
      "4  foo  two  small  3  6\n",
      "     A                          B                                    C  \\\n",
      "0  bar       [one, one, two, two]         [large, small, small, large]   \n",
      "1  foo  [one, one, one, two, two]  [small, large, large, small, small]   \n",
      "\n",
      "                 D                E  \n",
      "0     [4, 5, 6, 7]     [6, 8, 9, 9]  \n",
      "1  [1, 2, 2, 3, 3]  [2, 4, 5, 5, 6]  \n"
     ]
    }
   ],
   "source": [
    "# Collect each group's values into lists\n",
    "import pandas as pd\n",
    "df = pd.DataFrame({\"A\": [\"foo\", \"foo\", \"foo\", \"foo\", \"foo\",\n",
    "                         \"bar\", \"bar\", \"bar\", \"bar\"],\n",
    "                   \"B\": [\"one\", \"one\", \"one\", \"two\", \"two\",\n",
    "                         \"one\", \"one\", \"two\", \"two\"],\n",
    "                   \"C\": [\"small\", \"large\", \"large\", \"small\",\n",
    "                         \"small\", \"large\", \"small\", \"small\",\n",
    "                         \"large\"],\n",
    "                   \"D\": [1, 2, 2, 3, 3, 4, 5, 6, 7],\n",
    "                   \"E\": [2, 4, 5, 5, 6, 6, 8, 9, 9]})\n",
    "print(df.head())\n",
    "x_df=df.groupby(['A'],as_index=False).agg(lambda x: list(x))\n",
    "print(x_df.head())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fba8f77e",
   "metadata": {},
   "source": [
    "## Matplotlib\n",
    "---\n",
    "Plotting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 198,
   "id": "bdc25722",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAD4CAYAAAD1jb0+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAfCklEQVR4nO3dd3hVZbr+8e9D7zVBAkkMgoBUlYSmjm1U7A2xDDb0MDi9iTiOR8+MjmXKceacUYZRxq5Uu4xl7DMKCSChC9J2aAmEEggkJPv5/ZF4fkwGTNk72Xvt3J/r4nKXhet5WcmdlbXf513m7oiISPA0iXUBIiJSNwpwEZGAUoCLiASUAlxEJKAU4CIiAdWsIXeWlJTkGRkZDblLEZHAW7hw4Q53T676eoMGeEZGBjk5OQ25SxGRwDOzjUd6XZdQREQCSgEuIhJQCnARkYBSgIuIBJQCXEQkoKoNcDObbmb5ZrasyuvfN7NVZrbczB6uvxJFRORIanIG/iQw5vAXzOxM4FJgqLsPBH4b/dJEROTrVBvg7v4RUFjl5duAB929pHKb/HqoTUQk8PaXlHHvq8vZe/BQ1P/fdb0G3hc4zczmm9mHZpZ1tA3NbKKZ5ZhZTkFBQR13JyISPDv3lXDdXz7jmc82krOh6nlw5Ooa4M2ALsBI4HZgppnZkTZ092nununumcnJ/9YJKiKSkEKFxVw19VNWbSviz+OHcVb/Y6K+j7q20ucBc73idj4LzCwMJAE6xRaRRm/Vtr3cOH0BB0rLee7WEWRmdKmX/dT1DPxl4EwAM+sLtAB2RKkmEZHAWrC+kKumfgrArEmj6y28oQZn4Gb2AnAGkGRmecA9wHRgeuXUwlLgRtfNNUWkkXtnxXa+9/wienZuzdMThpPauU297q/aAHf3a4/y1vgo1yIiElgzsjdx59ylDE7txF9vyqJL2xb1vs8GXU5WRCTRuDuPfvAlv3lrNd/om8xj3zqZti0bJloV4CIidRQOO796YwV//ccGLj2xB78ZO5QWzRpuhRIFuIhIHZSWhfnZrCW8umQLE07pxS8uPIEmTY44m7reKMBFRGppf0kZk55dyMdrdnDHmP5MOv04jtIKU68U4CIitbBzXwkTnsxm2Za9PDx2COMy02JWiwJcRKSG8nYVc8MTC9i8+wB/Hj+Mbw6IfndlbSjARURqoKG6K2tDAS4iUo3sDYXc8mQ2rVs0Zdak0fTr3j7WJQEKcBGRr9XQ3ZW1oQAXETmKmdkhpszNbdDuytpQgIuIVBHL7sraiL+KRERiKNbdlbWhABcRqVRaFub22Ut45fPYdVfWhgJcRIR/7a6cPKYft53eOybdlbWhABeRRu+r7sqlm/fw8JVDGJcVu+7K2lCAi0ij9i/dlddnck6MuytrQwEuIo3W4d2Vz946gqw46K6sjWo/WjWz6WaWX3n7tKrv/dTM3MyS6qc8EZH6kb2hkHGV966cOWlU4MIbanZT4yeBMVVfNLM04FxgU5RrEhGpV++s2M74x+eT1L4lc24bTf/uHWJdUp1UG+Du/hFQeIS3/huYDOhmxiISGDOzQ0x6diH9u7dn9qTRcdUaX1t1ugZuZpcCm919SbxPsxERgYruysc+/JKH/7aa045PYur4YXHZXVkbta7ezNoAP6fi8klNtp8ITARIT0+v7e5ERCIWDjv3vbGS6f9YzyVDe/Dbq+K3u7I26jKC3kAvYImZbQBSgUVm1v1IG7v7NHfPdPfM5OTkulcqIlIHpWVhfjzzc6b/Yz03n5LBI1efmBDhDXU4A3f3pUC3r55Xhnimu++IYl0iIhHbX1LGbc8t4qMvCgLTXVkbNZlG+ALwKdDPzPLM7Jb6L0tEJDKF+0u57vH5fLKmgIevHMJ3zuiTUOENNTgDd/drq3k/I2rViIhEQd6uYm6YvoDNu4LXXVkbwf4IVkSkitXbirhh+nwOlJbzzC0jGN4reA06NaUAF5GEcfi9
K2dOGhXYBp2aUoCLSEJ4d8V2vvv8Inp2as1TE4aT1iW4DTo1pQAXkcCbmRPizrlLGdSjA9NvyqJru5axLqlBKMBFJLASsbuyNhrPSEUkoSRqd2VtKMBFJHAOv3flzadkcPeFA+L63pX1RQEuIoGS6N2VtaEAF5HAKNxfys1PZrM0bzcPXTmYq7Ma9wJ5CnARCYTG0l1ZGwpwEYl7X3VXFjeC7sraUICLSFz7qruyVfOmzGoE3ZW1oQAXkbjVGLsra0MBLiJx6avuyoE9OvDXRtRdWRsKcBGJK+7O1A/X8dDfVjXK7sra0L+KiMSNcNi5/82VPPFJ4+2urA0FuIjEhdKyMJNnL+Hlz7dw0+gM/vOixtldWRsKcBGJucO7K28/rx/fOaPxdlfWRk3uiTndzPLNbNlhr/3GzFaZWa6ZvWRmneq1ShFJWIffu/KhKwfz3TMT796V9aUmF5eeBMZUee0dYJC7DwG+AO6Mcl0i0gjk7Spm7NR/smrrXqaOH9boW+Nrq9oAd/ePgMIqr73t7mWVTz8DUuuhNhFJYKu3FTH2sU8pKCrhmVtGcO7A7rEuKXCi8fHuBGDe0d40s4lmlmNmOQUFBVHYnYgEXc6GQq6a+k/C7syaNEqt8XUUUYCb2V1AGfDc0bZx92nununumcnJyZHsTkQSwLsrtvOtx+eT1K4lc24brdb4CNR5FoqZ3QRcBJzt7h61ikQkYam7MrrqFOBmNgaYDJzu7sXRLUlEEk3V7srHxg+jnborI1btv6CZvQCcASSZWR5wDxWzTloC71RO9/nM3SfVY50iElCHd1dePLQHv1N3ZdRUG+Dufu0RXn6iHmoRkQSj7sr6pd9hRKReqLuy/inARSTqDr935YNXDOaa4WrQqQ8KcBGJqsPvXTl1/DA16NQjBbiIRM0X24u44YkF7C8t070rG4ACXESiImdDIRMq710589ujOCFFDTr1TQEuIhH7+8rtfOe5RfTo1Jqnde/KBqMAF5GIzMoJMUXdlTGhABeROnF3/vzROh6ct4pT+yQx9Xp1VzY0/WuLSK2Fw86v31zJ4+qujCkFuIjUyqHyMJNn5/LS4s3qrowxBbiI1FhxaRm3PbuID9VdGRcU4CJSI+qujD8KcBGp1ubdB7j+ifnk7TrAY+OHcZ66K+OCAlxEvta/dFdOGM6I47rGuiSppAAXkaNauLGQCU/m0KJZE3VXxiEFuIgc0d9Xbue7zy8ipaO6K+OVAlxE/s1X3ZUDUjrw15uzSFJ3ZVyqdua9mU03s3wzW3bYa13M7B0zW1P53871W6aINISKe1d+ye2zcxl1XFdemDhS4R3HatI69SQwpsprU4C/u/vxwN8rn4tIgIXDzv1vrOTBeau4aEgK02/KUmt8nKs2wN39I6CwysuXAk9VPn4KuCy6ZYlIQzpUHuans5bw+CfruWl0Bn+85iS1xgdAXX+8HuPuWysfbwOOOdqGZjYRmAiQnq6J/yLx5vDuyp+d25fvntlH3ZUBEfGPWHd3wL/m/WnununumcnJyZHuTkSiaNf+Uq77y3w+XlPAA1cM5ntnHa/wDpC6noFvN7MUd99qZilAfjSLEpH6t3n3AW54Yj4hdVcGVl3PwF8Fbqx8fCPwSnTKEZGG8MX2Iq589J/kF5XwzIThCu+Aqsk0wheAT4F+ZpZnZrcADwLnmNka4JuVz0UkABZuLOSqqZ9S7s7Mb49Sa3yAVXsJxd2vPcpbZ0e5FhGpZ++tqrh3ZfcOrXjmlhHqrgw4TfIUaSRmL8zjjjm56q5MIApwkUbgzx9+yQO6d2XC0VEUSWDhsPPAvJX85eP1XDQkhd+NG0rLZk1jXZZEiQJcJEEdfu/KG0cdyz0XD9S9KxOMAlwkAam7snFQgIskmF2V967MzdvNA1cM5lrduzJhKcBFEsjh3ZWPfmsYYwapQSeRKcBFEoC7Mysnj1+9sQIcnp4wnJFq0El4CnCRgAsVFnPn3KV8snYHw3t14aErh9Ar
qW2sy5IGoAAXCajysPPUPzfwm7dW07SJcd9lg7hueLpmmjQiCnCRAFqzvYjJc3JZvGk3Z/RL5teXD6ZHp9axLksamAJcJEBKy8JM/fBL/ve9tbRt2ZRHrj6RS0/soSmCjZQCXCQgcvN2M3l2Lqu2FXHx0B7cc/EArWfSyCnAReLcwUPl/Pc7X/CXj9eR3L4lf7khk3MGHPUuhtKIKMBF4thn63YyZU4uG3YWc+3wNKacfwIdWzePdVkSJxTgInGo6OAhHpy3iufmbyK9Sxuev3UEo/skxbosiTMKcJE4896q7dz10jK27z3Iraf24ifn9qVNC32ryr/TV4VInCjcX8ovX1vOy59v4fhu7Xj0ttGclN451mVJHIsowM3sx8CtgANLgZvd/WA0ChNpLNyd13K3cu+ryyk6eIgfnn083zmzt9btlmrVOcDNrCfwA2CAux8ws5nANcCTUapNJOFt23OQX7y8jHdXbmdoakceGjuC/t07xLosCYhIL6E0A1qb2SGgDbAl8pJEEp+782J2iF+/sZJD4TB3XXACE07tRVO1wUst1DnA3X2zmf0W2AQcAN5297erbmdmE4GJAOnpWpdYZOPO/UyZs5RP1+1k5HFdePCKIWRo8SmpgyZ1/Ytm1hm4FOgF9ADamtn4qtu5+zR3z3T3zOTk5LpXKhJw5WHn8Y/Xcd4jH7Fs8x5+fflgnr91pMJb6iySSyjfBNa7ewGAmc0FRgPPRqMwkUSyelvF4lNLQrs5u3837rt8ECkdtfiURCaSAN8EjDSzNlRcQjkbyIlKVSIJorQszKMfrOVP76+lfavm/PHak7h4SIoWn5KoiOQa+Hwzmw0sAsqAxcC0aBUmEnSfh3Zzx+xcVm8v4tITe3DPxQPp0rZFrMuSBBLRLBR3vwe4J0q1iCSEA6Xl/O7t1Uz/x3q6tW/FEzdmcvYJWnxKok+dmCJR9M8vdzBlzlI2FRZz3Yh0ppzfnw6ttPiU1A8FuEgU7D14iAfeXMkLC0Ic27UNL/zHSEb11k2FpX4pwEUi9O6K7dz18lIKikqY+I3j+PE3+9K6hdrgpf4pwEXqaOe+Eu59bQWvLdlC/+7tmXZ9JkPTOsW6LGlEFOAiteTuvLpkC/e+upx9JWX85Jy+TDq9Ny2a1bkvTqROFOAitbBl9wF+8fIy3luVz4lpnXh47BD6HtM+1mVJI6UAF6mBcNh5fsEmHpy3ivKwc/dFA7hpdIYWn5KYUoCLVGP9jv1MmZPL/PWFnNKnKw9cPoT0rm1iXZaIAlzkaMrKwzzxyXp+/84XtGjWhIeuHMy4zDS1wUvcUICLHMHKrXu5Y04uuXl7OGfAMdx32SCO6dAq1mWJ/AsFuMhhSsrK+dN7a3n0gy/p1KY5f7ruZC4Y3F1n3RKXFOAilRZu3MUdc3JZm7+PK07qyd0XDaCzFp+SOKYAl0avuLSM37y1mif/uYGUDq34681ZnNmvW6zLEqmWAlwatU/W7GDK3Fzydh3g+pHHMnlMP9pr8SkJCAW4NEp7Dhzi/jdWMDMnj15JbZkxcSQjjtPiUxIsCnBpdN5avo27X17Gzv2lTDq9Nz/65vG0aq7FpyR4FODSaBQUlXDvq8t5Y+lWTkjpwBM3ZjE4tWOsyxKps4gC3Mw6AY8DgwAHJrj7p1GoSyRq3J2XFm/ml6+voLiknNvP68fEbxxH86ZafEqCLdIz8D8Af3P3sWbWAlB/scSVzbsP8PO5S/nwiwJOTq9YfKpPNy0+JYmhzgFuZh2BbwA3Abh7KVAanbJEIhMOO8/O38hD81bhwL0XD+D6UVp8ShJLJGfgvYAC4K9mNhRYCPzQ3fcfvpGZTQQmAqSnp0ewO5Ga+bJgH1Pm5JK9YRenHZ/Ery8fTFoX/XIoiSeSi4DNgJOBx9z9JGA/MKXqRu4+zd0z3T0zOTk5gt2JfL2y8jCPfrCW8//wMau3FfGbsUN4esJwhbckrEjOwPOAPHefX/l8NkcIcJGGsHzLHu6Y
k8uyzXsZM7A7v7xsIN3aa/EpSWx1DnB332ZmITPr5+6rgbOBFdErTaR6Bw+V8z/vrWHqh+vo3KYFj33rZM4fnBLrskQaRKSzUL4PPFc5A2UdcHPkJYnUTM6GQibPyWVdwX6uPDmVuy86gU5ttPiUNB4RBbi7fw5kRqcUkZrZX1Kx+NRTn26gR8fWPDVhOKf31ecr0vioE1MC5aMvCrhz7lK27DnAjaMy+Nl5/WjXUl/G0jjpK18CYXdxKfe9sZLZC/M4Lrkts749isyMLrEuSySmFOAS9+Yt3crdryxnV3Ep3z2zN98/S4tPiYACXOJY/t6D/Ocry/nb8m0M7NGBpyZkMbCHFp8S+YoCXOKOuzN7YR6/en0FB8vCTB7Tj/84TYtPiVSlAJe4Eios5ucvLeXjNTvIyujMg1cOoXdyu1iXJRKXFOASF8Jh5+lPN/DwW6sx4FeXDuRbI46liRafEjkqBbjE3Nr8Iu6Ys5SFG3dxet9k7r98EKmdtX6JSHUU4BIzh8rDTPtoHX94dw1tWjbl9+OGcvlJPTHTWbdITSjAJSaWbd7D7bNzWbl1LxcOTuHeSwaS3L5lrMsSCRQFuDSog4fKeeTdNfzl43V0aduCqeOHMWZQ91iXJRJICnBpMAvWFzJlTi7rduxnXGYqd10wgI5tmse6LJHAUoBLvdtXUsZD81bxzGcbSe3cmmdvGcGpxyfFuiyRwFOAS716f3U+d81dyta9B5lwSi9+dl5f2rTQl51INOg7SerFrv2l/Or1FcxdvJk+3doxe9Johh3bOdZliSQUBbhElbvzxtKt3PPKcvYcOMQPzurDd8/qQ8tmWnxKJNoU4BI12/ce5O6Xl/H2iu0M7tmRZ24ZwYAeHWJdlkjCijjAzawpkANsdveLIi9JgsbdmZkT4r43VlJaFubO8/tzy6m9aKbFp0TqVTTOwH8IrAR0qtUIbdpZzJ0v5fKPtTsZ3qsLD105hF5JbWNdlkijEFGAm1kqcCFwP/CTqFQkgVBcWsbz8zfxu7e/oGkT477LBnHd8HQtPiXSgCI9A38EmAy0P9oGZjYRmAiQnp4e4e4kltydJXl7mJEd4rUlW9hXUsaZ/ZK5//LB9OjUOtbliTQ6dQ5wM7sIyHf3hWZ2xtG2c/dpwDSAzMxMr+v+JHZ2F5fy0uLNzMgOsWpbEa2aN+HCwT24OiuNrIzOWnxKJEYiOQM/BbjEzC4AWgEdzOxZdx8fndIklsJh59N1O3kxO8Rby7dRWhZmSGpH7r98EBcP7UGHVmqBF4m1Oge4u98J3AlQeQb+M4V38G3bc5DZC0PMyAkRKjxAh1bNuG54OuMy0zQlUCTOaB64cKg8zHur8pmRHeKD1fmEHUb37srPzu3HeQO76w7wInEqKgHu7h8AH0Tj/yUNZ13BPmbkhJizcDM79pXQrX1LbjujN+My0zi2q6YCisQ7nYE3MgdKy5m3bCsvZodYsL6Qpk2Ms/p345qsNE7vm6zmG5EAUYA3Ess27+HF7E28sngLRSVlHNu1DZPH9GPsyal069Aq1uWJSB0owBPYnuJDvLKkYvrf8i17admsCRcMTmFcZhojj+ui6X8iAacATzDuzvz1hczIDvHm0q2UlIUZkNKBX146kEuH9tQdcEQSiAI8QeTvPcjsRXnMzA6xYWcx7Vs246rMVK7JSmdQz46xLk9E6oECPMDKysN8sLqAGTkh3luVT3nYGd6rC98/63guGJxC6xaa/ieSyBTgAbRx535m5oSYlZNHflEJSe1acOtpvRiXmUbv5HaxLk9EGogCPCAOHirnreXbeHFBiE/X7aSJwRn9unF1Vhpn9e9Gc03/E2l0FOBxbsWWvczMCfHS4s3sOXCItC6t+ek5fRmbmUpKR60AKNKYKcDjUNHBQ7y6ZAszskPk5u2hRdMmnDeoO9dkpTHquK5ac1tEAAV43HB3cjbu4sUFFdP/Dhwqp98x7bnn4gFcdmJPOrdtEesSRSTOKMBjbMe+
EuYuyuPF7BDrCvbTtkVTLjupB1dnpTM0taOabUTkqBTgMVAedj5aU8CMBSHeXbmdsrAz7NjOPDy2NxcOTqFtSx0WEamekqIBhQqLmZUTYtbCPLbuOUiXti24+ZQMrs5Ko0+3o96VTkTkiBTg9aykrJx3VmxnRnaIT9buAOC045O5+6IBfPOEY2jRTNP/RKRuFOD1ZPW2ImZkh3hpcR67ig/Rs1Nrfnj28VyVmUZP3QBYRKJAAR5F+0rKeH3JFmbkhFi8aTfNmxrnDujOuKw0Tu2TRFNN/xORKFKAR8jdWRzazYwFIV7L3UJxaTl9urXjFxeewOUn9aRru5axLlFEElSdA9zM0oCngWMAB6a5+x+iVVi8K9xfytxFeczIDrEmfx+tmzfl4qEpXJ2VzsnpnTT9T0TqXSRn4GXAT919kZm1Bxaa2TvuviJKtcWdcNj5ZO0OZuSEeHv5Ng6VOyemdeKBKwZz0ZAU2rfSWtsi0nDqHODuvhXYWvm4yMxWAj2BhAvwLbsPMCsnj5k5ITbvPkCnNs0ZP/JYrs5Ko3/3DrEuT0QaqahcAzezDOAkYP4R3psITARIT0+Pxu4aRGlZmL+v3M6L2SE+WlOAO5zaJ4kp5/fnnAHH0Kq51toWkdiKOMDNrB0wB/iRu++t+r67TwOmAWRmZnqk+6tva/P3MTMnxJyFeezcX0r3Dq343pl9GJeZRlqXNrEuT0Tk/0QU4GbWnIrwfs7d50anpIZXXFrGG7lbmZEdImfjLpo1Mc4+oRvXZKXzjb7Jmv4nInEpklkoBjwBrHT330evpIbh7uTm7eHF7BCvLdnCvpIyjktqy53n9+eKk1NJbq/pfyIS3yI5Az8FuB5YamafV772c3d/M+Kq6tHu4lJeXryZF7NDrNpWRKvmTbhgcArXZKWTldFZ0/9EJDAimYXyCRCItAuHnc/W7eTF7BB/W76N0rIwg3t25L7LBnHJiT3ooOl/IhJACd2JuW3PQWYvDDEzJ49NhcV0aNWMa7PSGJeVxsAeHWNdnohIRBIuwA+Vh3l/VT4zskO8vzqfsMPI47rwk3P6MmZQd03/E5GEkTABvn7HfmZkh5i9MI8d+0pIbt+SSaf3ZlxmGhlJbWNdnohI1AU6wA+UljNvWcX0v/nrC2naxDizXzeuzkrjzH7JNGuqtbZFJHEFMsCXbd7DjOwQL3++maKDZRzbtQ23n9ePscNSOaZDq1iXJyLSIAIT4HsOHOLVzyum/y3fspcWzZpwwaDuXJ2VzoheXWiiZhsRaWQCEeB//Psa/vT+WkrKwpyQ0oH/umQgl53Yk45tNP1PRBqvQAR4j06tGTsslWuy0hnUs4OabURECEiAjx2WythhqbEuQ0QkrmiahohIQCnARUQCSgEuIhJQCnARkYBSgIuIBJQCXEQkoBTgIiIBpQAXEQkoc2+4G8WbWQGwsY5/PQnYEcVyYkljiT+JMg7QWOJVJGM51t2Tq77YoAEeCTPLcffMWNcRDRpL/EmUcYDGEq/qYyy6hCIiElAKcBGRgApSgE+LdQFRpLHEn0QZB2gs8SrqYwnMNXAREflXQToDFxGRwyjARUQCKq4C3Mymm1m+mS07yvtmZn80s7VmlmtmJzd0jTVVg7GcYWZ7zOzzyj//2dA11oSZpZnZ+2a2wsyWm9kPj7BNII5LDccSlOPSyswWmNmSyrH81xG2aWlmMyqPy3wzy4hBqdWq4VhuMrOCw47LrbGotSbMrKmZLTaz14/wXnSPibvHzR/gG8DJwLKjvH8BMA8wYCQwP9Y1RzCWM4DXY11nDcaRApxc+bg98AUwIIjHpYZjCcpxMaBd5ePmwHxgZJVtvgNMrXx8DTAj1nVHMJabgP+Nda01HM9PgOeP9HUU7WMSV2fg7v4RUPg1m1wKPO0VPgM6mVlKw1RXOzUYSyC4+1Z3X1T5uAhYCfSsslkgjksNxxIIlf/W+yqfNq/8U3VGwqXAU5WPZwNnWxzeULaGYwkEM0sF
LgQeP8omUT0mcRXgNdATCB32PI+AfgNWGlX5a+M8MxsY62KqU/nr3klUnCEdLnDH5WvGAgE5LpW/qn8O5APvuPtRj4u7lwF7gK4NWmQN1WAsAFdWXqKbbWZpDVthjT0CTAbCR3k/qsckaAGeSBZRsb7BUOB/gJdjW87XM7N2wBzgR+6+N9b1RKKasQTmuLh7ubufCKQCw81sUIxLqrMajOU1IMPdhwDv8P/PYuOGmV0E5Lv7wobaZ9ACfDNw+E/e1MrXAsfd9371a6O7vwk0N7OkGJd1RGbWnIrAe87d5x5hk8Acl+rGEqTj8hV33w28D4yp8tb/HRczawZ0BHY2aHG1dLSxuPtOdy+pfPo4MKyBS6uJU4BLzGwD8CJwlpk9W2WbqB6ToAX4q8ANlbMeRgJ73H1rrIuqCzPr/tW1LzMbTsWxiLtvrsoanwBWuvvvj7JZII5LTcYSoOOSbGadKh+3Bs4BVlXZ7FXgxsrHY4H3vPLTs3hSk7FU+UzlEio+v4gr7n6nu6e6ewYVH1C+5+7jq2wW1WPSrK5/sT6Y2QtUzAJIMrM84B4qPtDA3acCb1Ix42EtUAzcHJtKq1eDsYwFbjOzMuAAcE08fnNRcVZxPbC08holwM+BdAjccanJWIJyXFKAp8ysKRU/ZGa6++tm9ksgx91fpeKH1TNmtpaKD9SviV25X6smY/mBmV0ClFExlptiVm0t1ecxUSu9iEhABe0SioiIVFKAi4gElAJcRCSgFOAiIgGlABcRCSgFuIhIQCnARUQC6v8BG07ePfh3ZTkAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "x_axis = [1, 2, 3, 4]\n",
    "y_axis = [1, 4, 9, 16]\n",
    "plt.plot(x_axis,y_axis)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 199,
   "id": "4ef97687",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAD4CAYAAAD1jb0+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAPlklEQVR4nO3dbYxcV33H8e+vtls2QFmoF4jtpI6qslJLoE63KDQqBQJ1BFFipbxIpNCEUlmiEg8tcoSp1Kh9k6iuKG2RiqyQJpQ0gILrphRqIgKNKoHRJg44TwbU8uB1qDdEDrSswDH/vthxam+83tmZ8e6c9fcjrXzn3OO5/5OT/fnuuXf2pqqQJLXnZ5a7AElSbwxwSWqUAS5JjTLAJalRBrgkNWr1Uh5s7dq1tXHjxqU8pCQ17/7773+iqsbmti9pgG/cuJHJycmlPKQkNS/Jt0/V7hKKJDXKAJekRhngktQoA1ySGmWAS1KjFgzwJLcmOZzkoTnt70zyWJKHk/zFmStRktq1e98Ul9x8Lxe871+55OZ72b1vamDv3c1thLcBHwI+erwhyeuAK4FXVtWPk7x4YBVJ0gqxe98U23ftZ+boMQCmjsywfdd+ALZsWt/3+y94Bl5V9wFPzml+B3BzVf240+dw35VI0gqzY8+BZ8L7uJmjx9ix58BA3r/XNfCXAb+VZG+Sf0/yG/N1TLI1yWSSyenp6R4PJ0ntOXRkZlHti9VrgK8GXgRcDGwDPpkkp+pYVTuraqKqJsbGnvVJUElasdaNjiyqfbF6DfCDwK6a9RXgp8DagVQkSSvEts3jjKxZdVLbyJpVbNs8PpD37zXAdwOvA0jyMuBngScGUpEkrRBbNq3npqsuZP3oCAHWj45w01UXDuQCJnRxF0qSO4HXAmuTHARuBG4Fbu3cWvgT4Lry4ZqS9CxbNq0fWGDPtWCAV9U18+y6dsC1SJIWwU9iSlKjDHBJapQBLkmNMsAlqVEGuCQ1ygCXpEYZ4JLUKANckhplgEtSowxwSWqUAS5JjTLAJalRBrgkNcoAl6RGGeCS1CgDXJIaZYBLUqMWDPAktyY53Hl82tx9701SSXygsSQtsW7OwG8DLpvbmOQ84HeA7wy4JklSFxYM8Kq6D3jyFLv+CrgB8GHGkrQMeloDT3IlMFVVXx1wPZKkLi34VPq5kpwDvJ/Z5ZNu+m8FtgKcf/75iz2cJGkevZyB/xJwAfDVJN8CNgAPJHnpqTpX1c6qmqiqibGxsd4rlSSdZNFn4FW1H3jx8dedEJ+oqicGWJckaQHd3EZ4J/AlYDzJwSRvP/NlSZIWsuAZeFVds8D+jQOrRpLUNT+JKUmNMsAlqVEGuCQ1ygCXpEYZ4JLUKANckhplgEtSowxwSWqUAS5JjTLAJalRBrgkNcoAl6RGGeCS1CgDXJIaZYBLUqMMcElqlAEuSY0ywCWpUd08E/PWJIeTPHRC244kjyX5WpJ/SjJ6RquUJD1LN2fgtwGXzWm7B3h5Vb0C+DqwfcB1SZIWsGCAV9V9wJNz2j5XVU93Xn4Z2HAGapMkncYg1sB/H/jsfDuTbE0ymWRyenp6AIeTJEGfAZ7kT4CngTvm61NVO6tqoqomxsbG+jmcJOkEq3v9i0muBy4HLq2qGlhFkqSu9BTgSS4DbgB+u6p+NNiSJEnd6OY2wjuBLwHjSQ4meTvwIeD5wD1JHkzy4TNcpyRpjgXPwKvqmlM0f+QM1CJJWgQ/iSlJjTLAJalRBrgkNcoAl6RGGeCS1CgDXJIaZYBLUqMMcElqlAEuSY0ywCWpUQa4JDXKAJekRhngktQoA1ySGmWAS1KjDHBJapQBLkmN6uaRarcmOZzkoRPaXpTkniTf6Pz5wjNbpiRprm7OwG8DLpvT9j7g81X1y8DnO68lSUtowQCvqvuAJ+c0Xwnc3tm+Hdgy2LIkSQvpdQ38JVX1eGf7e8BL5uuYZGuSySST09PTPR5OkjRX
3xcxq6qAOs3+nVU1UVUTY2Nj/R5OktTRa4D/d5JzATp/Hh5cSZKkbvQa4HcD13W2rwP+eTDlSJK61c1thHcCXwLGkxxM8nbgZuCNSb4BvKHzWpK0hFYv1KGqrpln16UDrkWStAh+ElOSGmWAS1KjDHBJapQBLkmNWvAipqQ27N43xY49Bzh0ZIZ1oyNs2zzOlk3rl7ssnUEGuLQC7N43xfZd+5k5egyAqSMzbN+1H8AQX8FcQpFWgB17DjwT3sfNHD3Gjj0HlqkiLQUDXFoBDh2ZWVS7VgYDXFoB1o2OLKpdK4MBLq0A2zaPM7Jm1UltI2tWsW3z+DJVpKXgRUxpBTh+odK7UM4uBri0QmzZtN7APsu4hCJJjTLAJalRBrgkNcoAl6RGGeCS1CgDXJIa1VeAJ/mjJA8neSjJnUmeM6jCJEmn13OAJ1kPvAuYqKqXA6uAqwdVmCTp9PpdQlkNjCRZDZwDHOq/JElSN3oO8KqaAv4S+A7wOPBUVX1ubr8kW5NMJpmcnp7uvVJJ0kn6WUJ5IXAlcAGwDnhukmvn9quqnVU1UVUTY2NjvVcqSTpJP0sobwD+q6qmq+oosAv4zcGUJUlaSD8B/h3g4iTnJAlwKfDoYMqSJC2knzXwvcBdwAPA/s577RxQXZKkBfT162Sr6kbgxgHVIklaBD+JKUmNMsAlqVEGuCQ1ygCXpEYZ4JLUKANckhplgEtSowxwSWqUAS5JjTLAJalRBrgkNcoAl6RGGeCS1CgDXJIaZYBLUqMMcElqlAEuSY3qK8CTjCa5K8ljSR5N8upBFSZJOr2+HqkG/DXwb1X1liQ/C5wzgJokSV3oOcCTvAB4DXA9QFX9BPjJYMqSJC2knyWUC4Bp4O+T7EtyS5Lnzu2UZGuSySST09PTfRxOknSifgJ8NXAR8HdVtQn4X+B9cztV1c6qmqiqibGxsT4OJ0k6UT8BfhA4WFV7O6/vYjbQJUlLoOcAr6rvAd9NMt5puhR4ZCBVSZIW1O9dKO8E7ujcgfKfwNv6L0mS1I2+AryqHgQmBlOKJGkx/CSmJDXKAJekRhngktQoA1ySGmWAS1KjDHBJapQBLkmNMsAlqVEGuCQ1ygCXpEYZ4JLUKANckhplgEtSowxwSWqUAS5JjTLAJalRBrgkNarfR6qRZBUwCUxV1eX9l6QW7N43xY49Bzh0ZIZ1oyNs2zzOlk3rl7ss6azSd4AD7wYeBX5+AO+lBuzeN8X2XfuZOXoMgKkjM2zftR/AEJeWUF9LKEk2AG8GbhlMOWrBjj0Hngnv42aOHmPHngPLVJF0dup3DfyDwA3AT+frkGRrkskkk9PT030eTsPg0JGZRbVLOjN6DvAklwOHq+r+0/Wrqp1VNVFVE2NjY70eTkNk3ejIotolnRn9nIFfAlyR5FvAx4HXJ/nYQKrSUNu2eZyRNatOahtZs4ptm8eXqSLp7NRzgFfV9qraUFUbgauBe6vq2oFVpqG1ZdN6brrqQtaPjhBg/egIN111oRcwpSU2iLtQdBbasmm9gS0ts4EEeFV9EfjiIN5LktQdP4kpSY0ywCWpUQa4JDXKAJekRhngktQoA1ySGmWAS1KjDHBJapQBLkmNMsAlqVEGuCQ1ygCXpEYZ4JLUKANckhplgEtSowxwSWqUAS5JjTLAJalRPQd4kvOSfCHJI0keTvLuQRYmSTq9fp6J+TTw3qp6IMnzgfuT3FNVjwyoNknSafR8Bl5Vj1fVA53tHwKPAj6mXJKWyEDWwJNsBDYBe0+xb2uSySST09PTgzicJIkBBHiS5wGfAt5TVT+Yu7+qdlbVRFVNjI2N9Xs4SVJHXwGeZA2z4X1HVe0aTEmSpG70cxdKgI8Aj1bVBwZXkiSpG/2cgV8CvBV4fZIHO19vGlBdkqQF9HwbYVX9B5AB1iJJWgQ/iSlJjTLAJalRBrgkNcoAl6RG9fO7UJbE7n1T7NhzgENHZlg3OsK2zeNs2eQn9iVpqAN8974ptu/a
z8zRYwBMHZlh+679AIa4pLPeUC+h7Nhz4JnwPm7m6DF27DmwTBVJ0vAY6gA/dGRmUe2SdDYZ6gBfNzqyqHZJOpsMdYBv2zzOyJpVJ7WNrFnFts3jy1SRJA2Pob6IefxCpXehSNKzDXWAw2yIG9iS9GxDvYQiSZqfAS5JjTLAJalRBrgkNcoAl6RGpaqW7mDJNPDtHv/6WuCJAZaznBzL8Fkp4wDHMqz6GcsvVtXY3MYlDfB+JJmsqonlrmMQHMvwWSnjAMcyrM7EWFxCkaRGGeCS1KiWAnznchcwQI5l+KyUcYBjGVYDH0sza+CSpJO1dAYuSTqBAS5JjRqqAE9ya5LDSR6aZ3+S/E2Sbyb5WpKLlrrGbnUxltcmeSrJg52vP13qGruR5LwkX0jySJKHk7z7FH2amJcux9LKvDwnyVeSfLUzlj87RZ+fS/KJzrzsTbJxGUpdUJdjuT7J9Anz8gfLUWs3kqxKsi/Jp0+xb7BzUlVD8wW8BrgIeGie/W8CPgsEuBjYu9w19zGW1wKfXu46uxjHucBFne3nA18HfqXFeelyLK3MS4DndbbXAHuBi+f0+UPgw53tq4FPLHfdfYzleuBDy11rl+P5Y+AfT/X/0aDnZKjOwKvqPuDJ03S5EvhozfoyMJrk3KWpbnG6GEsTqurxqnqgs/1D4FFg7i9ob2JeuhxLEzr/rf+n83JN52vuHQlXArd3tu8CLk2SJSqxa12OpQlJNgBvBm6Zp8tA52SoArwL64HvnvD6II1+A3a8uvNj42eT/OpyF7OQzo97m5g9QzpRc/NymrFAI/PS+VH9QeAwcE9VzTsvVfU08BTwC0taZJe6GAvA73aW6O5Kct7SVti1DwI3AD+dZ/9A56S1AF9JHmD29xu8EvhbYPfylnN6SZ4HfAp4T1X9YLnr6ccCY2lmXqrqWFX9GrABeFWSly9zST3rYiz/AmysqlcA9/D/Z7FDI8nlwOGqun+pjtlagE8BJ/7Lu6HT1pyq+sHxHxur6jPAmiRrl7msU0qyhtnAu6Oqdp2iSzPzstBYWpqX46rqCPAF4LI5u56ZlySrgRcA31/S4hZpvrFU1fer6sedl7cAv77EpXXjEuCKJN8CPg68PsnH5vQZ6Jy0FuB3A7/XuevhYuCpqnp8uYvqRZKXHl/7SvIqZudi6L65OjV+BHi0qj4wT7cm5qWbsTQ0L2NJRjvbI8AbgcfmdLsbuK6z/Rbg3upcPRsm3YxlzjWVK5i9fjFUqmp7VW2oqo3MXqC8t6qundNtoHMyVA81TnIns3cBrE1yELiR2QsaVNWHgc8we8fDN4EfAW9bnkoX1sVY3gK8I8nTwAxw9TB+czF7VvFWYH9njRLg/cD50Ny8dDOWVublXOD2JKuY/Ufmk1X16SR/DkxW1d3M/mP1D0m+yewF9auXr9zT6mYs70pyBfA0s2O5ftmqXaQzOSd+lF6SGtXaEookqcMAl6RGGeCS1CgDXJIaZYBLUqMMcElqlAEuSY36P+mAJZHGlQEeAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# scatter plot of y = x^2 at four sample points\n",
    "x_axis = [1, 2, 3, 4]\n",
    "y_axis = [1, 4, 9, 16]\n",
    "plt.scatter(x_axis, y_axis)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 200,
   "id": "392dc281",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD5CAYAAADcDXXiAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAANbElEQVR4nO3df6zd9V3H8edLOqzQbPzoTcNasE0gM0SdY3cMg1lwnQmMaVEWglm2hhCbmKnoJFJNDBgTA8kiYDRbmsHWLQsD2UxxKAur1GVGkNuCdNABDT+LhV6ywcYYbt3e/nG/05vrve2953tuz+2nz0dyc8735/kcDt/n/ebbe85JVSFJastPjXoAkqThM+6S1CDjLkkNMu6S1CDjLkkNWjbqAQCsXLmy1q5dO+phSNJRZefOnS9X1dhsy5ZE3NeuXcvExMSohyFJR5Ukz861zMsyktQg4y5JDTLuktQg4y5JDTLuktQg4y5JDTps3JPcmuRAkm9Mm3dKknuTPNndntzNT5K/SbI3ySNJzlnMwUuSZjefM/fPABfOmLcZ2F5VZwHbu2mAi4Czup9NwCeGM0xJ0kIcNu5V9TXgWzNmbwC2dve3ApdMm//ZmnI/cFKS04Y0VknSPA36DtVVVbW/u/8isKq7vxp4ftp6+7p5+5khySamzu4544wzBhyGpCNh7ea7Rz2EZj1z/cWLst/e/6BaU1/ltOCvc6qqLVU1XlXjY2OzfjSCJGlAg8b9pZ9cbuluD3TzXwBOn7bemm6eJOkIGjTudwEbu/sbgW3T5n+k+6uZ84BXp12+kSQdIYe95p7kNuACYGWSfcC1wPXAHUmuBJ4FLutW/yfg/cBe4HXgikUYsyTpMA4b96r67TkWrZ9l3QI+2ndQkqR+fIeqJDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg4y7JDXIuEtSg3rFPckfJXk0yTeS3JZkeZJ1SR5IsjfJ7UmOH9ZgJUnzM3Dck6wG/gAYr6qfB44DLgduAG6sqjOBbwNXDmOgkqT563tZZhnwM0mWAScA+4H3And2y7cCl/R8DEnSAg0c96p6Afg48BxTUX8V2Am8UlUHu9X2Aatn2z7JpiQTSSYmJycHHYYkaRZ9LsucDGwA1gFvBU4ELpzv9lW1parGq2p8bGxs0GFIkmbR57LM+4Cnq2qyqn4IfAk4Hzipu0wDsAZ4oecYJUkL1CfuzwHnJTkhSYD1wGPAfcAHu3U2Atv6DVGStFB9rrk/wNQ/nO4Cdnf72gJcA3wsyV7gVOCWIYxTkrQAyw6/ytyq6lrg2hmznwLO7bNfSVI/vkNVkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQcZdkhpk3CWpQb3inuSkJHcm+WaSPUl+OckpSe5N8mR3e/KwBitJmp++Z+43A/dU1c8Bbwf2AJuB7VV1FrC9m5YkHUEDxz3JW4D3ALcAVNUPquoVYAOwtVttK3BJvyFKkhaqz5n7OmAS+HSSh5J8KsmJwKqq2t+t8yKwqu8gJUkL0yfuy4BzgE9U1TuA7zHjEkxVFVCzbZxkU5KJJBOTk5M9hiFJmqlP3PcB+6rqgW76TqZi/1KS0wC62wOzbVxVW6pqvKrGx8bGegxDkjTTwHGvqheB55O8rZu1HngMuAvY2M3bCGzrNUJJ0oIt67n97wOfT3I88BRwBVO/MO5I
ciXwLHBZz8eQJC1Qr7hX1cPA+CyL1vfZrySpH9+hKkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkNMu6S1CDjLkkN6h33JMcleSjJl7vpdUkeSLI3ye1Jju8/TEnSQgzjzP0qYM+06RuAG6vqTODbwJVDeAxJ0gL0inuSNcDFwKe66QDvBe7sVtkKXNLnMSRJC9f3zP0m4E+AH3fTpwKvVNXBbnofsHq2DZNsSjKRZGJycrLnMCRJ0w0c9yQfAA5U1c5Btq+qLVU1XlXjY2Njgw5DkjSLZT22PR/4jSTvB5YDbwZuBk5Ksqw7e18DvNB/mJKkhRj4zL2q/rSq1lTVWuBy4F+q6kPAfcAHu9U2Att6j1KStCCL8Xfu1wAfS7KXqWvwtyzCY0iSDqHPZZn/VVU7gB3d/aeAc4exX0nSYHyHqiQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoMGjnuS05Pcl+SxJI8muaqbf0qSe5M82d2ePLzhSpLmo8+Z+0Hgj6vqbOA84KNJzgY2A9ur6ixgezctSTqCBo57Ve2vql3d/e8Ce4DVwAZga7faVuCSnmOUJC3QUK65J1kLvAN4AFhVVfu7RS8Cq+bYZlOSiSQTk5OTwxiGJKnTO+5JVgBfBP6wqr4zfVlVFVCzbVdVW6pqvKrGx8bG+g5DkjRNr7gneRNTYf98VX2pm/1SktO65acBB/oNUZK0UH3+WibALcCeqvrraYvuAjZ29zcC2wYfniRpEMt6bHs+8GFgd5KHu3l/BlwP3JHkSuBZ4LJeI5QkLdjAca+qrwOZY/H6QfcrSerPd6hKUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoOMuyQ1yLhLUoP6fFmHNJC1m+8e9RCa9cz1F496CFoiPHOXpAYZd0lqkHGXpAYZd0lqkHGXpAYZd0lqkHGXpAYZd0lqkHGXpAYZd0lqkHGXpAYZd0lqkHGXpAYZd0lqkHGXpAYZd0lqkHGXpAYtStyTXJjk8SR7k2xejMeQJM1t6F+zl+Q44O+AXwP2AQ8muauqHhv2Y4Ff2baY/Mo26ei1GGfu5wJ7q+qpqvoB8AVgwyI8jiRpDovxBdmrgeenTe8D3j1zpSSbgE3d5GtJHl+EsSxFK4GXRz2I+cgNox7BknDUvF7ga9Y5ll6zn51rwWLEfV6qaguwZVSPPypJJqpqfNTj0Pz4eh19fM2mLMZlmReA06dNr+nmSZKOkMWI+4PAWUnWJTkeuBy4axEeR5I0h6Fflqmqg0l+D/gKcBxwa1U9OuzHOYodc5eijnK+XkcfXzMgVTXqMUiShsx3qEpSg4y7JDXIuA9BkpH9Sal0rPA4WxjjPg9J/rz7rJyvJ7ktydVJdiS5KckEcFWS9UkeSrI7ya1Jfrrb9pkkK7v740l2dPevS/K5JP+e5Mkkv3OIx1+RZHuSXd3+fcfvEpfkgiRfHvU4jiajPs669a/p9v2fSa5f7Oe8mPxNeBhJ3gVcCrwdeBOwC9jZLT6+qsaTLAeeBNZX1RNJPgv8LnDTYXb/i8B5wInAQ0nurqr/mmW9N4DfrKrvdP8D3999Xo//Gj6gJMuq6uCox6EpS+E4S3IRUx+V8u6qej3JKUN4aiPjmfvhnQ9sq6o3quq7wD9OW3Z7d/s24Omq
eqKb3gq8Zx773lZV36+ql4H7mPpcntkE+KskjwBfZeojHlYt8HkcU5bCWSDw5iR3d+P4ZBKPt7kthePsfcCnq+p1gKr61kKfxFLimXs/35vHOgf5v1+iy2csm3nmPdeZ+IeAMeCdVfXDJM/Msi91lsJZYOdc4GzgWeAe4LeAOwd+YseuI3WcNcUzicP7N+DXkyxPsgL4wCzrPA6sTXJmN/1h4F+7+88A7+zuXzpjuw3dfk8FLmDq3b2zeQtwoAv7r3KIDwsSsDTOAgH+o/t01B8BtwG/sqBncWxZCsfZvcAVSU4A8LJM46rqQaY+PuER4J+B3cCrM9Z5A7gC+Psku4EfA5/sFv8FcHN3KeBHM3b/CFOBuB/4y0OcAX4eGO/2/RHgm32f1zHsSJ4FHpNnjINYCsdZVd3TjWEiycPA1f2f2QhVlT+H+QFWdLcnABPAOUPY53XA1aN+bi3+AO9i6lLMcmAF8ARTB+oOYLxbZznwHHBmN/0Z4Kru/leBi7r7NwI7pr1mD3fbntpt/9Y5xnAB8H1gHVO/KL4CXDrq/zZL+cfjbLg/nrnPz5buN/ku4ItVtWvE49Eh1BI4C+w8CPwtsAd4GviHHk/rWOBxNkR+tswSkuQXgM/NmP3fVfX/vuxEh5ZkRVW91l0//RqwqW8sklwHvFZVHx/GGDUax8px5l/LLCFVtRv4pVGPoxFbkpzN1CWUrZ4F6ieOlePMM3eph2PlLFBHH+MuSQ3yH1QlqUHGXZIaZNwlqUHGXZIa9D/MZoe+b3yiogAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# bar chart: one bar per category label\n",
    "x_axis = ['group_a', 'group_b', 'group_c']\n",
    "y_axis = [1, 10, 100]\n",
    "\n",
    "plt.bar(x_axis, y_axis)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 201,
   "id": "fe0f02a6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXIAAAD4CAYAAADxeG0DAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAMg0lEQVR4nO3db4hldR3H8c+naWKjLF12RHHdpugPF6Y/wlUCt2JMQyqyBz1woSi6sBC0GAiS3QfpgwtRUD0oiKHrg0huBFqJGKU0Fhfyz93NLHcsJLKUwpEMk1gd128P9q6s06xzZ89v58z33PcLLsw998w5X/bsfvz5Pef3u44IAQDyek3dBQAAqiHIASA5ghwAkiPIASA5ghwAknttHSfds2dPzM/P13FqAEjr8OHDT0fE3PrttQT5/Py8RqNRHacGgLRsP77RdlorAJAcQQ4AyRHkAJAcQQ4AyRHkAJAcQQ4AyRHkAJAcQQ4AydUyIQgAzoTtysdo4ncwEOQA0tgshG03Mqg3Q2sFAJIjyAEgOYIcAJIjyAEgOYIcAJIjyAEgOYIcAJIjyAEgOYIcAJIjyAEgOYIcAJJjrRVMlRKLLknNXHgJeRHkmCosuoQmKtZasT1j+3e27yx1TADA5kr2yK+TtFLweACACRQJctt7JX1M0vdLHA8AMLlSI/JvS7pB0kun28H2Qdsj26PV1dVCpwUAVA5y2x+X9FREHH61/SJiKSLaEdGem5ureloAwFiJEfnlkj5h+6+SfiTpCts/LHBcAMAEKgd5RNwYEXsjYl7StZJ+FRGfrlwZAGAizOwEgOSKTgiKiHsl3VvymACAV8eIHACSI8gBIDmCHACSI8gBIDmCHACSI8gBIDmCHACSI8gBIDmCHACSI8gBIDmCHACSI8gBIDmCvKDBYKCFhQXNzMxoYWFBg8Gg7pIATIGiqx9Os8FgoG63q36/r/3792s4HKrT6UiSDhw4UHN1AJqMEXkhvV5P/X5fi4uLmp2d1eLiovr9vnq9Xt2lAWg4R8S2n7TdbsdoNNr2855NMzMzOnbsmGZnZ1/etra2pl27dun48eM1VoatsK06/k2gjKZfP9uHI6K9fjsj8kJarZaGw+Ertg2HQ7VarZoqAjAtCPJCut2uOp2OlpeXtba2puXlZXU6HXW73bpLA9Bw3Ows5OQNzUOHDmllZUWtVku9Xo8bnQDOOnrkwCma3mNtuqZfP3rkANBQBDkAJEeQA0ByBDkAJEeQA0BylYPc9sW2l20ftf2I7etKFAYAmEyJ58hflHR9RByxfY6kw7bvjoijBY4NANhE5RF5RPwjIo6Mf/6PpBVJF1U9LgBgMkV75LbnJV0i6f6SxwUAnF6xKfq23yjpNklfiohnN/j8oKSDkrRv375Sp912toscp8mzzwBsryIjctuzOhHit0bE7RvtExFLEdGOiPbc3FyJ09YiIjZ9TbIfAJRS4qkVS+pLWomIb1YvCQCwFSVG5JdL+oykK2w/NH59tMBxAQATqNwjj4ihpDKNYwDAljGzEwCSI8gBIDmCHACSI8gBIDmCHACSI8gBIDmCHI2ye/du2T7jl6RKv29bu3fvrvlPAdOm2ForwE7wzDPP1L4EQqn1eIBJMSIHgOQIcgBIjiAHsCNUvb8xzfc46JED2BF2wv0NKec9DkbkAJAcQQ4AyRHkAJAcQQ4AyRHkAJAcQQ4AyRHk6+yEZ1kzPscKoD48R77OTniWNeNzrADqw4gcAJIjyAEgOYIcAJIjyAEgOYIcAJIjyAEguSJBbvtq23+y/ZjtL5c4JgBgMpWfI7c9I+m7kq6S9ISkB23fERFHqx67DvHVN0k3vbn+GgBgQiUmBF0m6bGI+Isk2f6RpGskpQxy3/zsjpgQFDfVWgKAREq0Vi6S9PdT3j8x3vYKtg/aHtkera6uFjgtAEDaxin6EbEkaUmS2u12/d/nBGBH2QltzZfrSKZEkD8p6eJT3u8d
bwOAie2EtqaUs7VZorXyoKR32H6r7ddJulbSHQWOCwCYQOUReUS8aPuLkn4haUbSLRHxSOXKAAATKdIjj4i7JN1V4lhAFTuhz5qxx4rcWI8cjbIT+qwZe6zIjSn6AJAcQQ4AyRHkAJAcPfIN1P2dmeedd16t5weQC0G+TokbZbZrv+EGYHrQWgGA5AhyAEiOIAeA5AhyAEiOIAeA5AhyAEiOIAeA5AhyAEiOIAeA5AhyAEiOIAeA5AhyAEiOIAeA5Fj9EI3DMsSYNgQ5GqXq8sEsQYyMaK0AQHIEOQAkR5ADQHIEOQAkR5ADQHKVgtz2N2w/avth2z+xfW6hugAAE6o6Ir9b0kJEvEfSnyXdWL0kANPKdu2vjPMAKj1HHhG/POXtfZI+Va0cANOqxPP70zoPoGSP/POSfn66D20ftD2yPVpdXS14WgCYbpuOyG3fI+mCDT7qRsTPxvt0Jb0o6dbTHSciliQtSVK73Z6+/2QCwFmyaZBHxJWv9rntz0n6uKQPxxT8P82k63hstt8U/FEB2CaVeuS2r5Z0g6QPRcR/y5S0sxHAAHaaqj3y70g6R9Ldth+y/b0CNaU1GAy0sLCgmZkZLSwsaDAY1F0SgClQ9amVt5cqJLvBYKBut6t+v6/9+/drOByq0+lIkg4cOFBzdQCajJmdhfR6PfX7fS0uLmp2dlaLi4vq9/vq9Xp1lwag4VxHz7fdbsdoNNr2855NMzMzOnbsmGZnZ1/etra2pl27dun48eM1VoatmNbnkJui6dfP9uGIaK/fzoi8kFarpeFw+Iptw+FQrVarpooATAuCvJBut6tOp6Pl5WWtra1peXlZnU5H3W637tIANBxf9VbIyRuahw4d0srKilqtlnq9Hjc6AZx19MiBUzS9x9p0Tb9+9MgBoKEI8oKYEASgDvTIC2FCEIC6MCIvhAlBAOrCzc5CmBDUDE2/WdZ0Tb9+3Ow8y5gQBKAuBHkhTAgCUBdudhbChCAAdaFHDpyi6T3Wpmv69aNHDgANRZADQHIEOQAkR5ADQHIEOQAkR5ADQHIEOQAkR5ADQHIEOQAkR5ADQHIEOQAkVyTIbV9vO2zvKXE8AMDkKge57YslfUTS36qXAwDYqhIj8m9JukFSc5ccA4AdrNJ65LavkfRkRPze9mb7HpR0UJL27dtX5bTAGdvs7+mk+zR5qVTks2mQ275H0gUbfNSV9BWdaKtsKiKWJC1JJ9Yj30KNQDEEMJpo0yCPiCs32m773ZLeKunkaHyvpCO2L4uIfxatEgBwWmfcWomIP0g6/+R723+V1I6IpwvUBQCYEM+RA0Byxb58OSLmSx0LADA5RuQAkBxBDgDJEeQAkBxBDgDJEeQAkBxBDgDJEeQAkBxBDgDJEeQAkBxBDgDJEeQAkBxBDgDJEeQAkBxBDgDJEeQAkBxBDgDJEeQAkBxBDgDJEeQAkBxBDgDJEeQAkBxBDgDJEeQAkBxBDgDJEeQAkBxBDgDJVQ5y24dsP2r7EdtfL1EUAGByr63yy7YXJV0j6b0R8bzt88uUBQCYVNUR+RckfS0inpekiHiqekkAgK2oGuTvlPQB2/fb/rXtS0+3o+2Dtke2R6urqxVPCwA4adPWiu17JF2wwUfd8e/vlvR+SZdK+rHtt0VErN85IpYkLUlSu93+v88BAGdm0yCPiCtP95ntL0i6fRzcD9h+SdIeSQy5AWCbVG2t/FTSoiTZfqek10l6uuIxAQBbUOmpFUm3SLrF9h8lvSDpsxu1VQAAZ0+lII+IFyR9ulAtAIAzUHVEDgDbxnblfZrYNCDIAaTRxBAugbVWACA5ghwAkiPIASA5ghwAkiPIASA5ghwAkiPIASA5ghwAknMdD9jbXpX0+LafePvsEYuHZcW1y63p1+8tETG3fmMtQd50tkcR0a67Dmwd1y63ab1+tFYAIDmCHACSI8jPjqW6C8AZ49rlNpXXjx45ACTHiBwAkiPIASA5grwg27fYfmr8HaZIxPbF
tpdtH7X9iO3r6q4Jk7O9y/YDtn8/vn43113TdqJHXpDtD0p6TtIPImKh7nowOdsXSrowIo7YPkfSYUmfjIijNZeGCfjE97u9ISKesz0raSjpuoi4r+bStgUj8oIi4jeS/lV3Hdi6iPhHRBwZ//wfSSuSLqq3KkwqTnhu/HZ2/JqaUSpBDqxje17SJZLur7kUbIHtGdsPSXpK0t0RMTXXjyAHTmH7jZJuk/SliHi27nowuYg4HhHvk7RX0mW2p6a9SZADY+Pe6m2Sbo2I2+uuB2cmIv4taVnS1TWXsm0IckAv3yzrS1qJiG/WXQ+2xvac7XPHP79e0lWSHq21qG1EkBdkeyDpt5LeZfsJ2526a8LELpf0GUlX2H5o/Ppo3UVhYhdKWrb9sKQHdaJHfmfNNW0bHj8EgOQYkQNAcgQ5ACRHkANAcgQ5ACRHkANAcgQ5ACRHkANAcv8DH8iHbHFqVmsAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# make three normally distributed samples (mean 0; std 1, 2, 3; 100 points each)\n",
    "datas = [np.random.normal(0, std, 100) for std in range(1, 4)]\n",
    "\n",
    "plt.boxplot(datas)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 202,
   "id": "9b7ecb99",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX8AAAEXCAYAAABF40RQAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAhtUlEQVR4nO3deXxU1fnH8c+juIPiArgiLj8XXFCaWq0gFFwQLajFrdqCUtHWaq1r1dZqXeouqHVBQEAQrYAtAqIUcUMUIwoCWncBiyXgggJKIM/vjzNppzEhkzAzZ2bu9/165cXM3Du5z+Umz5yce55zzN0REZFkWS92ACIikn9K/iIiCaTkLyKSQEr+IiIJpOQvIpJASv4iIgmk5C8ikkBK/lJyzOwjMzvczK4ws0GRYhhqZtfFOLZIJprEDkAkV9z9htgxZIuZXQ3s7u6nx45FSoNa/iIFzszUSJOsU/KXkmVmV5vZiNTjNmbmZtbbzOab2RIzuzJt3/XM7Hdm9r6ZLTWzv5rZVhkco4OZvWRmX5jZAjPrk7Z5SzObYGZfmdkrZrZb2vsGpPZfZmavmVnHGnGPNrMRZrYMOAe4AjjZzL42s1lZ+O+RhFPyl6TpAOwJdAWuMrO9U6+fBxwHdAK2Bz4H/rK2b2RmOwNPAncBLYADgDfSdjkFuAbYEngPuD5t26up/bcCHgYeM7ON07b3BEYDzYHBwA3Ao+7e1N3bZXy2InVQ8pekucbdV7r7LGAWUJ1IzwGudPeF7v4tcDXQq54ul58C/3D3Ue5e6e5L3f2NtO2Pu/sMd18NjCQkewDcfURq/9XufhuwEeFDqdp0d/+bu1e5+8p1PGeR71BfoiTNp2mPVwBNU493Bh43s6q07WuAVsAndXyvnYD3G3EszOxioC/hrwwHNge2Sdt/wVq+r8g6U8tfJFgAHO3uzdO+Nnb3uhJ/9Xt2W8v2WqX69y8FTgK2dPfmwJeApe1Wc651zb0uWaXkLxLcB1yf6sfHzFqYWc963jMSONzMTjKzJma2tZkdkMGxmgGrgQqgiZldRWj5r82/gTZmpt9ZyQr9IIkEA4BxwNNm9hXwMvCDtb3B3ecD3YGLgM8IN3szuRn7FDAJeAf4GPiG+rt5Hkv9u9TMZmZwDJG1Mq3kJSKSPGr5i4gkkJK/yFqY2WmpwqqaX3NjxyayLtTtIyKSQEUzzn+bbbbxNm3axA5DRKSovPbaa0vcvUXN14sm+bdp04by8vLYYYiIFBUz+7i219XnLyKSQEr+IiIJpOQvIpJASv4iIgmk5C8ikkA5Tf5mNsTMFpvZnBqvn2dmb5vZXDO7OZcxiIjId+W65T8U6Jb+gpn9iLBKUTt33we4NccxiIhIDTlN/u7+PGG2w3S/BG5MrZaEuy/OZQwiIkXr66/hz3+GNWuy/q1j9PnvAXRMLWj9nJl9v64dzayfmZWbWXlFRUUeQxQRiWzJEujaFX7/e3jppax/+xjJvwlh0eqDgUuAv5qZ1bajuw909zJ3L2vR4jvVySIipenjj6FDB5g9Gx5/HDp2zPohYkzvsBAY62FGuRmpNVO3IaxqJCKSbHPmQLduocvn6adzkvghTsv/b8CPAMxsD2BDYEmEOERECsu0aSHZV1XBCy/kLPFD7od6jgKmA3ua2UIz6wsMAXZNDf98BOjtmldaRJLuiSfg8MOhZcvQx7/ffjk9XE67fdz91Do2nZ7L44qIFJUHH4SzzoIDD4SJEyEP9zhV4SsiEos73HQTnHkmdOkCU6fmJfGDkr+ISBxVVXDRRfC738Gpp8L48dC0ad4OXzSLuYiIlIxVq0Jrf+RIOP98uOMOWC+/bXElfxGRfFq+HH7yE3jqKbjhhtDyr73UKaeU/EVE8mXJEjjmGCgvh0GDoG/faKEo+YuI5MPHH8NRR4V/x46Fnj2jhqPkLyKSa3PnhsSf46rdhtBoHxGRXJo2LczTk4eq3YZQ8hcRyZXx4/Na
tdsQSv4iIrkwdCgcdxzsuy+8+CK0aRM5oP+l5C8ikk3ucPPNcMYZoWr3mWfyVrXbEEr+IiLZUlUFF18Ml10Gp5wSun2aNYsdVa002kdEJBsqK0PV7ogRcN550L9/3qt2G0LJX0RkXS1fDr16waRJcP31cPnlUap2G0LJX0RkXaRX7T7wAPziF7EjyoiSv4hIY82fD0ceWTBVuw2h5C8i0hgFWLXbELlexnGImS1OLdlYc9tFZuZmtk0uYxARybqXXvrvWrvPP190iR9yP9RzKNCt5otmthNwJDA/x8cXEcmu6qrdbbYJHwL77x87okbJafJ39+eBz2rZdAdwKaCF20WkeFRX7e6zT5izp8Cqdhsi74NQzawn8Im7z8r3sUVEGiW9avdHPyrYqt2GyOsNXzPbFLiC0OWTyf79gH4ArVu3zmFkIiJ1qKqCSy6B228PVbvDhsGGG8aOap3lu+W/G7ALMMvMPgJ2BGaa2ba17ezuA929zN3LWhT5p6yIFKHKSujdOyT+884La+6WQOKHPLf83f1NoGX189QHQJm7L8lnHCIi9SrCqt2GyPVQz1HAdGBPM1toZvEWrBQRydTSpdC1axi//8ADcMUVJZX4Icctf3c/tZ7tbXJ5fBGRBps/PxRvffghjBkTRveUIFX4iohUq1m1e9hhsSPKmcKdb1REJJ+qq3bXrAlVuyWc+EHJX0QEJkwoiardhlDyF5FkGzYszMbZtm1Ya3eXXWJHlBdK/iKSXLfcAn36hKrdqVOhZct631IqlPxFJHmq19q99FI4+eTQ7VOga+3mikb7iEiyFNlau7mi5C8iyZFetXvddSVZvJUpJX8RSYalS8Nau6++CgMHwllnxY4oKiV/ESl9CanabQglfxEpbXPnQrdu8NVXJV+12xDJu8shIslRXbW7enUiqnYbQslfREpTddXu1lsnpmq3IZT8RaT0pFftTpuWmKrdhlDyF5HSUl2127lz4qp2G0LJX0RKg6p2G0SjfUSk+FVWQt++8NBD8Otfw4ABiazabQglfxEpbsuXw4knwpNPJr5qtyFyvYbvEDNbbGZz0l67xczeNrPZZva4mTXPZQwiUsKq19p96qlQtXvllUr8Gcr130VDgW41XpsM7Ovu+wPvAJfnOAYRKUXz50OHDvDGGzB6dOKna2ionCZ/d38e+KzGa0+7++rU05eBHXMZg4iUoHnz4NBD4V//Cq3+44+PHVHRiX1H5Ezgybo2mlk/Mys3s/KKioo8hiUiBWv69NDir67a7dQpdkRFKVryN7MrgdXAyLr2cfeB7l7m7mUtWrTIX3AiUpgmTAh9/NVVu+3axY6oaEVJ/mbWBzgWOM3dPUYMIlJkhg9X1W4W5T35m1k34FKgh7uvyPfxRaQI3XIL9O6tqt0syvVQz1HAdGBPM1toZn2Bu4FmwGQze8PM7stlDCJSxNKrdk86SVW7WZTTIi93P7WWlwfn8pgiUiJUtZtTqvAVkcKTXrV77bUq3soBJX8RKSxLl8Kxx8KMGVprN4eU/EWkcCxYENba/eCDULWr4q2cUfIXkcIwb15I/MuWhapdFW/llO6eiEh8qtrNOyV/EYkrvWp32jRV7eaJkr+IxFNdtbv33iHx77pr7IgSQ8lfROK49VZV7Uak5C8i+VVdtXvJJf+t2t1889hRJY5G+4hI/qRX7Z57bqjaXX/92FElklr+IpIfy5fDcceFxP+nP8FddynxR6SWv4jkXnrV7v33Q79+sSNKPCV/Ecmt9Krdxx6DE06IHZGg5C8iuaSq3YKlPn8RyY3p06FjR1XtFiglfxHJvokTQ9XuVlupardAKfmLSHYNHw49eoSq3RdfVNVugcr1Mo5DzGyxmc1Je20rM5tsZu+m/t0ylzGISB5VV+126hSqdlu1ih2R1CHXLf+hQLcar/0OmOLu/wdMST0XkWJWVRUqdi+5JKzANXGiqnYLXE6Tv7s/D3xW4+WewLDU42HAcbmMQURyrLISzjgj
tPrPPRdGjYKNNoodldQjRp9/K3dflHr8KVDn34Vm1s/Mys2svKKiIj/RiUjmqqt2hw9X1W6RiXrD190d8LVsH+juZe5e1qJFizxGJiL1WroUDj8cJk2C++6DP/xBi6wXkRhFXv82s+3cfZGZbQcsjhCDiKyL6qrd999X1W6RitHyHwf0Tj3uDfw9Qgwi0lhvvQU//CF88kmo2lXiL0q5Huo5CpgO7GlmC82sL3AjcISZvQscnnouIsWgeq3dykp47rmwEIsUpYy6fcxsU+AioLW7n2Vm/wfs6e7j1/Y+dz+1jk1dGxamiEQ3cSL06gXbbw9PP63irSKXacv/QeBb4JDU80+A63ISkYgUnoceClW7e+2ltXZLRKbJfzd3vxmoBHD3FYBu64skwW23wc9/Hqp2n31WVbslItPkv8rMNiE1LNPMdiP8JSAipaq6avfii1W1W4IyHer5R2ASsJOZjQQOBfrkKigRiayyEn7xi1C89atfwZ13qnirxGSU/N19spnNBA4mdPf8xt2X5DQyEYlj+XI46aTQ0r/mGhVvlai1Jn8za1/jpeppGVqbWWt3n5mbsEQkis8+g2OOCWvt3ncfnH127IgkR+pr+d+2lm0OdMliLCISk6p2E2Wtyd/df5SvQEQkEvcwE+f554e+/qeeUvFWAmQ02sfMNjazC81srJmNMbMLzGzjXAcnIjm2cGEYv3/aabD77qGCV4k/ETId6jkc2Ae4C7g79fihXAUlIjlWVQX33w9t28KUKXD77aF4q23b2JFJnmQ61HNfd0//qZhqZvNyEZCI5Ni778JZZ4W5ebp0gQceUMVuAmXa8p9pZgdXPzGzHwDluQlJRHJi9eqw2tb++8Prr4ek/49/KPEnVH1DPd8kjOrZAHjJzOannu8MvJ378EQkK2bPhr59obw89PHfcw/ssEPsqCSi+rp9js1LFCKSG99+CzfcEL623BIefTRM1aCircSrb6jnx+nPzawloFE+IsXg5ZdDa3/ePDj9dOjfH7beOnZUUiAyHerZI7X4yofAc8BHwJM5jEtEGmv5crjwwrDa1rJlMGFCmJJZiV/SZHrD91rCvD7vuPsuhMVYXs5ZVCLSOFOmwH77wR13wDnnwNy50L177KikAGWa/CvdfSmwnpmt5+5TgbIcxiUiDfHFF2EWzsMPhyZNwjDOe+7RFMxSp0yT/xdm1hR4HhhpZgOA5etyYDP7rZnNNbM5ZjZKFcMijfT3v4firKFD4bLLYNYsOOyw2FFJgcs0+fcEVgK/Jczr/z7w48Ye1Mx2AM4Hytx9X2B94JTGfj+RRPr3v+Hkk+G446BlS3jlFbjxRthkk9iRSRHIdD7/9Fb+sCweexMzqwQ2Bf6Vpe8rUtrcYcQIuOAC+PpruO46uPRS2GCD2JFJEamvyOsrUks31twEuLs3qkPR3T8xs1uB+YS/KJ5296drOX4/oB9A69atG3MokdIyf364kfvkk3DIITB4MOy9d+yopAittdvH3Zu5++a1fDVLT/xmtmVDDpravyewC7A9sJmZnV7L8Qe6e5m7l7Vo0aIhhxApLVVV4QbuPvuEm7kDBsALLyjxS6Nl2udfnykN3P9w4EN3r3D3SmAs8MMsxSJSWt55J0yzfO65obU/Z06Ye19r6so6yFbyb2it+HzgYDPb1MyMUDfwVpZiESkNq1fDTTeFidjefBMefDAstLLLLrEjkxKQ6ZTO9antvkDdO7u/YmajgZnAauB1YGCWYhEpfrNmwZlnwsyZcPzx8Je/wHbbxY5KSki2Wv4N5u5/dPe93H1fd/+Zu38bKxaRgvHNN/D730NZGXzyCYweDWPHKvFL1q01+ZvZRDNrk8H30RSBIuvqpZfgwAPh+uvDsorz5sFPfhI7KilR9bX8HwSeNrMrzWxtg4i7ZjEmkWT5+utwA7dDB1ixAiZNCtW6W20VOzIpYfVN6fyYmT0J/AEoN7OHgKq07ben/v0sp1GKlKqnn4Z+/cL4/XPP
DfPuN2sWOypJgExu+K4izOOzEdCMtOQvIo30+edh2uWhQ2HPPeH550PLXyRP6qvw7QbcDowD2rv7irxEJVLKxo4NrfyKCrj8crjqKthY8xpKftXX8r8SONHd5+YjGJGS9umn8Otfw5gxcMABMHFiuMErEkF90zt0VOIXWUfuoXunbVsYPz7068+YocQvUWWryEtEavPRR3D22eHG7qGHwqBBsNdesaMSiVfkJVLSqqrgrrtg333D+P277w43dZX4pUCo5S+SbW+/HZZUnDYNjjoK7r8fdt45dlQi/0Mtf5FsqawM/fnt2oXq3GHDwrz7SvxSgNTyF8mGmTOhb1944w3o1St087RqFTsqkTqp5S+yLlauDGP1DzooDOUcMwYee0yJXwqeWv4ijfXii6G1/847cMYZcNttsGWDFrUTiUYtf5GG+uqrUKzVsSOsWhWGcQ4ZosQvRUXJX6QhJk0KwzfvuQd+85uwwtYRR8SOSqTBlPxFMrF0KfTuDUcfDZttFoZx9u8PTZvGjkykUaIlfzNrbmajzextM3vLzA6JFYtIndzDDdy2beHhh8MqW6+/HhZSFyliMW/4DgAmuXsvM9sQ2DRiLCLftWgR/OpX8Le/wfe+F/r227WLHZVIVkRp+ZvZFsBhwGAAd1/l7l/EiEXkO9zDDdy99w59/DfdBC+/rMQvJSVWt88uQAXwoJm9bmaDzGyzmjuZWT8zKzez8oqKivxHKcnz4Ydw5JFhCGe7djBrFlx6KTTRqGgpLbGSfxOgPXCvux9IWCnsdzV3cveB7l7m7mUtWrTId4ySJGvWwIABYSTPK6/AvffC1Kmwxx6xIxPJiVjJfyGw0N1fST0fTfgwEMm/efPCEooXXACdOsHcuXDOObCeBsNJ6Yry0+3unwILzGzP1EtdgXkxYpEEW7UKrr02LKry7rswYgRMmAA77RQ7MpGci9mReR4wMjXS5wPgjIixSNKUl4d+/dmz4eST4c47oWXL2FGJ5E205O/ubwBlsY4vCbVyJfzxj2EenlatwjDOnj1jRyWSdxrCIMnx3HNhkZX33oOzzoKbb4bmzWNHJRKF7mhJ6Vu2DH75S+jcOSyvOGUKDByoxC+JpuQvpW3CBNhnn5DsL7ww9PF36RI7KpHolPylNC1ZAqefDsceC5tvHhZRv+22MCmbiCj5S4lxh0ceCVMzPPpouLk7cyb84AexIxMpKLrhK6Xjk0/CRGzjxkFZWZifZ7/9YkclUpDU8pfi5w4PPBCmXZ48GW69FaZPV+IXWQu1/KW4vf9+GLY5dWoYzfPAA7D77rGjEil4avlLcVqzBm6/PbTuX3sN7r8/DOFU4hfJiFr+UnzmzAlTM8yYEUbz3Hsv7Lhj7KhEiopa/lI8Vq2Ca66B9u3hgw/CsorjxinxizSCWv5SHGbMCK39OXPgpz8Ni6drjQeRRlPLXwrbihVw8cVhwfTPP4cnnoCRI5X4RdaRWv5SuKZODROxffABnH12WEt3iy1iRyVSEtTyl8Lz5Zch2XfpAmbhQ+C++5T4RbJIyV8KyxNPhGKtQYNCd8/s2WH8vohklZK/FIaKCjj1VOjRA7beGl5+GW65BTbdNHZkIiUpavI3s/XN7HUzGx8zDonIPQzZ3HtvGDMmDOUsL4fvfz92ZCIlLfYN398AbwGbR45DYliwICyyMmFCmHVz8OAw976I5Fy0lr+Z7QgcAwyKFYNEsmpVuIG7zz7hZu4dd8C0aUr8InkUs+XfH7gUaFbXDmbWD+gH0Lp16/xEJbnz9tuhdT9sWOjj79o1rLC1666xIxNJnCgtfzM7Fljs7q+tbT93H+juZe5e1kJFPcVp+XIYOhQ6dgz9+v37Q4cOoatn8mQlfpFIYrX8DwV6mFl3YGNgczMb4e6nR4pHssk9zLQ5aFC4mfvVV7DHHqFI6+c/h223jR2hSOJFSf7ufjlwOYCZdQYuVuIvAZ99FqZeGDQojM/fZBM48cRQpduhQyjYEpGCEHu0
jxS7qip49tnQlz9mDHz7LXzve2Ga5VNPVVWuSIGKnvzd/Vng2chhSEP961+hL3/w4DD3TvPmoYXfty8ceGDs6ESkHtGTvxSRykqYODF060ycGFr9nTvDn/4EJ5wQunlEpCgo+Uv93n0XhgwJLf1PPw03bC+7DM48U8smihQpJX+p3cqVoQ9/0CB47jlYf33o3j107XTvDk30oyNSzPQbLP/r9ddDwh85MkytvNtucMMN0Ls3bL997OhEJEuU/AW++AJGjQpJf+ZM2Ggj6NUr3Lzt1AnW0+SvIqVGyT+p3OGFF0LCf+wx+OYbaNcO7roLTjsNttwydoQikkNK/knz6adhbp3Bg8ON3M03hz59Ql9++/YqxBJJCCX/JFi9Gp56KrTyn3gC1qwJc+1ceWXo3tlss9gRikieKfmXsg8+CEM0H3wwFGW1bAkXXhiGaO61V+zoRCQiJf9S88038PjjoVtnypRws7ZbN7j7bjj2WNhgg9gRikgBUPIvFbNnh4T/0EPw+efQpk2ovO3TB3baKXZ0IlJglPyL2bJl8MgjoS//1Vdhww3h+OPDzdsuXTREU0TqpORfbNxh+vSQ8B99FFasCMsf9u8Pp58OW28dO0IRKQJK/sWiogKGDw9J/+23oWlT+OlPQyv/oIM0RFNEGkTJv5CtWROWOhw8GP7+9zCr5iGHhOcnnRQ+AEREGkHJvxB9/HEYnjlkCCxYELpyzjsvTLfQtm3s6ESkBCj5F4pvv4Vx40K3zuTJ4bUjjoDbboMePcJ8OyIiWaLkH9u8eaEbZ/hwWLIkDMu86io44wzYeefY0YlIiYqS/M1sJ2A40ApwYKC7D4gRSxRffw1//Wto5U+fHgqvevYM3TpHHBHmzhcRyaFYLf/VwEXuPtPMmgGvmdlkd58XKZ7cc4cZM0LCf+SR8AGw115w663ws5+FqRdERPIkSvJ390XAotTjr8zsLWAHoPSS/9KlMGJESPpz5sCmm8LJJ4chmoccoiGaIhJF9D5/M2sDHAi8Usu2fkA/gNatW+c3sHVRVQXPPBMS/uOPw6pVYSz+/ffDKaeEaZRFRCKKmvzNrCkwBrjA3ZfV3O7uA4GBAGVlZZ7n8Bpu4cL/DtH86KOwIMo554S+/P33jx2diMh/REv+ZrYBIfGPdPexseJYZ5WVMH58aOVPmhRa/V27wp//DMcdBxtvHDtCEZHviDXax4DBwFvufnuMGNbZP/8ZhmgOGwaLF4fFzS+/PMyVv+uusaMTEVmrWC3/Q4GfAW+a2Rup165w94mR4snMihVhvdvBg8P6t+uvDz/+cbh5e9RR0CT6LRQRkYzEGu3zIlAcw1zcYebM0K3z8MNhGuXdd4cbb4TevWHbbWNHKCLSYGqq1uXzz2HkyJD0Z80Kffcnnhha+R07aoimiBQ1Jf90VVXw3HOhW2f06DDfTvv28Je/hOmTmzePHaGISFYo+UNY3HzYsJD0338fttgiDM/s2zckfxGREpPc5L96NUycGLp1Jk4Mc+d36gRXXw0nnBAqcUVESlTykv9774UirKFDYdEiaNUKLr44DNHcY4/Y0YmI5EUykv/KlTB2bGjlP/tsWNi8e/dw87Z79zCrpohIgpR+8r/6ahgwAL74IhRfXX99GKK5ww6xIxMRiab0k/+aNXD00aGV37lzaPWLiCRc6Sf/a6+NHYGISMFRM1hEJIGU/EVEEkjJX0QkgZT8RUQSSMlfRCSBlPxFRBJIyV9EJIGU/EVEEsjcPXYMGTGzCuDjRr59G2BJFsOJSedSeErlPEDnUojW9Tx2dvcWNV8smuS/Lsys3N3LYseRDTqXwlMq5wE6l0KUq/NQt4+ISAIp+YuIJFBSkv/A2AFkkc6l8JTKeYDOpRDl5DwS0ecvIiL/KyktfxERSaPkLyKSQCWT/M1siJktNrM5dWw3M7vTzN4zs9lm1j7fMWYqg3PpbGZfmtkbqa+r8h1jJsxsJzObambzzGyumf2mln2K4rpkeC7F
cl02NrMZZjYrdS7X1LLPRmb2aOq6vGJmbSKEWq8Mz6WPmVWkXZdfxIg1E2a2vpm9bmbja9mW3Wvi7iXxBRwGtAfm1LG9O/AkYMDBwCuxY16Hc+kMjI8dZwbnsR3QPvW4GfAO0LYYr0uG51Is18WApqnHGwCvAAfX2OdXwH2px6cAj8aOex3OpQ9wd+xYMzyfC4GHa/s5yvY1KZmWv7s/D3y2ll16AsM9eBlobmbb5Se6hsngXIqCuy9y95mpx18BbwE71NitKK5LhudSFFL/11+nnm6Q+qo58qMnMCz1eDTQ1cwsTyFmLMNzKQpmtiNwDDCojl2yek1KJvlnYAdgQdrzhRTpL2/KIak/dZ80s31iB1Of1J+oBxJaZumK7rqs5VygSK5LqnvhDWAxMNnd67wu7r4a+BLYOq9BZiiDcwH4SapbcbSZ7ZTfCDPWH7gUqKpje1avSZKSfymZSZivox1wF/C3uOGsnZk1BcYAF7j7stjxrIt6zqVorou7r3H3A4AdgYPMbN/IITVaBufyBNDG3fcHJvPf1nPBMLNjgcXu/lq+jpmk5P8JkP6Jv2PqtaLj7suq/9R194nABma2TeSwamVmGxCS5Uh3H1vLLkVzXeo7l2K6LtXc/QtgKtCtxqb/XBczawJsASzNa3ANVNe5uPtSd/829XQQ8L08h5aJQ4EeZvYR8AjQxcxG1Ngnq9ckScl/HPDz1OiSg4Ev3X1R7KAaw8y2re7rM7ODCNex4H4xUzEOBt5y99vr2K0orksm51JE16WFmTVPPd4EOAJ4u8Zu44Deqce9gGc8daexkGRyLjXuIfUg3K8pKO5+ubvv6O5tCDdzn3H302vsltVr0qSxbyw0ZjaKMNpiGzNbCPyRcPMHd78PmEgYWfIesAI4I06k9cvgXHoBvzSz1cBK4JRC/MUktGZ+BryZ6pMFuAJoDUV3XTI5l2K5LtsBw8xsfcIH1F/dfbyZ/Qkod/dxhA+6h8zsPcLgg1PihbtWmZzL+WbWA1hNOJc+0aJtoFxeE03vICKSQEnq9hERkRQlfxGRBFLyFxFJICV/EZEEUvIXEUkgJX8RkQRS8pfESk3T/KGZbZV6vmXqeZta9m1jdUyxnbZP59qm4q3nPc+aWVmDAhfJAiV/SSx3XwDcC9yYeulGYKC7fxQtKJE8UfKXpLsDONjMLgA6ALfW94bUXwEvmNnM1NcP0zZvbmYTzOyfZnafma2Xes+RZjY9tf9jqQniRKJR8pdEc/dK4BLCh8AFqef1WQwc4e7tgZOBO9O2HQScB7QFdgNOSE3u9nvg8NR7ygmLdohEUzJz+4isg6OBRcC+hCl/67MBcLeZHQCsAfZI2zbD3T+A/8zR1AH4hvBhMC0179uGwPRsBS/SGEr+kmipBH4EYQnJF83skQxmFf0t8G+gHeGv52/SttWcLMsJSw1OdvdTsxK0SBao20cSKzX98r2E7p75wC1k0OdPmEd9kbtXEWb6XD9t20Fmtkuqr/9k4EXgZeBQM9s9ddzNzGyPmt9UJJ+U/CXJzgLmu3t1V889wN5m1qme990D9DazWcBewPK0ba8CdxPmjP8QeNzdKwjTCI8ys9mELp+9snYWIo2gKZ1FRBJILX8RkQTSDV+RNGa2H/BQjZe/dfcfxIhHJFfU7SMikkDq9hERSSAlfxGRBFLyFxFJICV/EZEE+n80t6BR/x2PtQAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "x_axis = [1, 2, 3, 4]\n",
    "y_axis = [1, 4, 9, 16]\n",
    "\n",
    "plt.plot(x_axis, y_axis, color=\"r\")  # set the line color with the color parameter\n",
    "plt.xlabel(\"X_label\")  # add the x-axis label\n",
    "plt.ylabel(\"Y_label\")  # add the y-axis label\n",
    "plt.title(\"line_chart\")  # add the plot title\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 203,
   "id": "6f94b3d7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD4CAYAAAAXUaZHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAXYElEQVR4nO3de5SddX3v8fcXAgkXNRhiEkEIKsVVWUuBKfUUhSOgJxoWoUfaRT1yqKWNiiLKWUvBUj1nTT2KPdriEq0IBLqkSgUkVll4MGJjFillINFyOZU0cglOYCig4RKGJN/zx7MnsyczO5l9mX159vu11qyZ/Vz2/j0kfPKd3/N7fr/ITCRJ5bJXpxsgSWo9w12SSshwl6QSMtwlqYQMd0kqoVmdbgDAwQcfnIsXL+50MySpp9x9991PZub8qfZ1RbgvXryYoaGhTjdDknpKRDxca5/dMpJUQoa7JJWQ4S5JJWS4S1IJGe6SVEKGuyR1yPCWYU665iQ2P7u55e9tuEtShwyuHmTNI2sY/KfBlr+34S5JHTC8ZZgV61ewI3ewYv2KllfvhrskdcDg6kF25A4Atuf2llfvhrsktdlY1T66fRSA0e2jLa/eDXdJarPqqn1Mq6t3w12S2mztprU7q/Yxo9tHuWPTHS37jK6YOEyS+sm6D6yb8c+wcpekEjLcJamEDHdJKiHDXZJKyHCXpBIy3CWphAx3SSohw12SSshwl6QSMtwlqYQMd0kqoT2Ge0RcHRFPRMS9VdteGRG3RcSDle8HVbZHRHw5IjZExM8j4tiZbLwkaWrTqdyvAZbssu0iYFVmHgmsqrwGeBdwZOVrOfC11jRTklSPPYZ7Zq4Gntpl8zLg2srP1wJnVG3/uyz8MzA3Iha1qK2SpGlqtM99QWYOV37eDCyo/HwI8GjVcZsq2yaJiOURMRQRQyMjIw02Q5I0laZvqGZmAtnAeVdk5kBmDsyfP7/ZZkiSqjQa7o+PdbdUvj9R2f4Y8Jqq4w6tbJOkUhveMsxJ15zU0nVQm9FouH8POKfy8znAyqrt/70yauYtwK+rum8kqbQGVw+y5pE1LV0HtRnTGQr5LWAtcFREbIqIc4HPA++IiAeBUyuvAW4BNgIbgG8A581IqyWpiwxvGWbF+hXsyB2sWL+iK6r3Pa6hmpl/VGPXKVMcm8CHm22UJPWSwdWD7MgdAGzP7Qz+0yCXL728o23yCVVJasJY1T66fRSA0e2jXVG9G+6S1ITqqn3MWPXeSYa7JDVh7aa1O6v2MaPbR7lj0x0dalFhj33ukqTa1n1gXaebMCUrd0kqIcNdkkrIcJekEjLcJamEDHdJKiHDXZJKyHCXpBIy3CWphAx3SSohw12SSshwl6QSMtwlqYQMd0kqIcNdkkrIcJekXQxvGeaka07q+GpKzTDcJWkXg6sHWfPImo6vptQMw12Sqoytibojd3TFWqiNMtwlqUr1mqjdsBZqowx3SaoYq9rH1kQd3T7as9W74S5JFdVV+5herd4Nd0mqWLtp7c6qfczo9lHu2HRHh1rUuFmdboAkdYt1H1jX6Sa0jJW7JJVQU+EeER+PiPsi4t6I+FZEzImIIyLizojYEBHXR8S+rWqsJGl6Gg73iDgE+CgwkJlHA3sDZwGXAn+dma8HngbObUVDJUnT12y3zCxgv4iYBewPDAMnAzdU9l8LnNHkZ0iS6tRwuGfmY8D/AR6hCPVfA3cDz2Tmtsphm4BDpjo/IpZHxFBEDI2MjDTaDEnSFJrpljkIWAYcAbwaOABYMt3zM/OKzBzIzIH58+c32gxJ0hSa6ZY5FfhlZo5k5kvATcAJwNxKNw3AocBjTbZRklSnZsL9EeAtEbF/RARwCnA/cDtwZuWYc4CVzTVRklSvZvrc76S4cXoP8K+V97oC+CRwYURsAOYBV7WgnZKkOjT1hGpmfgb4zC6bNwLHN/O+kqTm+ISqJJWQ4S6p9MqwbF69DHdJ
pVeGZfPqZbhLKrWyLJtXL8NdUqmVZdm8ehnukkqrTMvm1ctwl1RaZVo2r16Gu6TSKtOyefVymT1JpVWmZfPqZeUuSSVkuEtSCRnuklRChrsklZDhLkklZLhLUgkZ7pJUQoa7JJWQ4S5JJWS4S1IJGe6SVEKGu6Se0o9L5jXCcJfUU/pxybxGGO6Seka/LpnXCMNdUs/o1yXzGmG4S+oJ/bxkXiMMd0k9oZ+XzGuE4S6pJ/TzknmNaGqZvYiYC1wJHA0k8CfAvwHXA4uBh4A/zMynm/kcSernJfMa0Wzlfhlwa2a+AXgT8ABwEbAqM48EVlVeS5LaqOFwj4hXACcCVwFk5mhmPgMsA66tHHYtcEZzTZQk1auZyv0IYARYERHrIuLKiDgAWJCZw5VjNgMLpjo5IpZHxFBEDI2MjDTRDEnSrpoJ91nAscDXMvMY4Dl26YLJzKToi58kM6/IzIHMHJg/f34TzZAk7aqZcN8EbMrMOyuvb6AI+8cjYhFA5fsTzTVRklSvhsM9MzcDj0bEUZVNpwD3A98DzqlsOwdY2VQLJUl1a2ooJHA+cF1E7AtsBN5P8Q/GP0TEucDDwB82+RmSpDo1Fe6ZuR4YmGLXKc28rySpOT6hKqljnJt95hjukjrGudlnjuEuqSOcm31mGe6SOsK52WeW4S6p7ZybfeYZ7pLazrnZZ57hLqntnJt95jX7EJMk1c252WeelbsklZDhLkklZLhLUgkZ7pJUQoa7JJWQ4S5JJWS4S1IJGe6SWsYpfLuH4S6pZZzCt3sY7pJawil8u4vhLqklnMK3uxjukprmFL7dx3CX1DSn8O0+hrukpjmFb/dxyl9JTXMK3+5j5S5JJWS4S1IJGe6SVEJNh3tE7B0R6yLi+5XXR0TEnRGxISKuj4h9m2+mJKkerajcLwAeqHp9KfDXmfl64Gng3BZ8hiSpDk2Fe0QcCiwFrqy8DuBk4IbKIdcCZzTzGZI6w0nAeluzlfvfAJ8Axp5emAc8k5nbKq83AYdMdWJELI+IoYgYGhkZabIZklrNScB6W8PhHhGnAU9k5t2NnJ+ZV2TmQGYOzJ8/v9FmSJoBTgLW+5qp3E8ATo+Ih4BvU3THXAbMjYixh6MOBR5rqoWS2s5JwHpfw+GemRdn5qGZuRg4C/hxZv434HbgzMph5wArm26lpLZxErBymIlx7p8ELoyIDRR98FfNwGdImiFOAlYOLZlbJjN/Avyk8vNG4PhWvK+k9nMSsHJw4jBJEzgJWDk4/YAklZDhLkklZLhLUgkZ7lIfcCqB/mO4S33AqQT6j+EulZxTCfQnw10qOacS6E+Gu1RiTiXQvwx3qcScSqB/Ge5SiTmVQP9y+gGpxJxKoH9ZuUtSCRnuklRChrsklZDhLvUgpxPQnhjuUg9yOgHtieEu9RinE9B0GO5Sj3E6AU2H4S71EKcT0HQZ7lIPcToBTZfhLvUQpxPQdDn9gNRDnE6gHBYuhMcfn7x9wQLY3KIeNit3SWqzqYJ9d9sbYbhLHeYDSZoJhrvUYT6Q1NsWLoSIyV8LF3a2XYa71EE+kNT72tHF0oiGwz0iXhMRt0fE/RFxX0RcUNn+yoi4LSIerHw/qHXNlcrFB5I0U5qp3LcB/yMzfxt4C/DhiPht4CJgVWYeCayqvJa0Cx9I6j7t6mJZsKC+7Y1oONwzczgz76n8vAV4ADgEWAZcWznsWuCMJtsolZIPJHWfdnWxbN4MmZO/WjUMElrU5x4Ri4FjgDuBBZk5XNm1GZjy36KIWB4RQxExNDIy0opmSD3FB5I0k5p+iCkiDgRuBD6Wmb+JiJ37MjMjIqc6LzOvAK4AGBgYmPIYqcx8IGnmteNhoQULan9GJzVVuUfEPhTBfl1m3lTZ/HhELKrsXwQ80VwTpd7hmPXu0o5ulnZ0sTSimdEyAVwFPJCZX6ra9T3gnMrP5wArG2+e1Fscs65u0UzlfgJwNnByRKyvfL0b+Dzwjoh4EDi18loqPces
z6x2jGRpxyiWdmm4zz0z1wBRY/cpjb6v1KumGrN++dLLO9yq8mhXF0tZ+ISq1AKOWVe3MdylFnDMen3K9LBQtzLcpRrqGfnimPX6lOlhoW7lYh1SDdUjX/bUd+6YdXUbK3dpCo58qY8jWbqP4S5Nwdka69PPDwt1K8Nd2kW/j3zp1sUnVB/DXdpFv498aUcVbhfLzDPc1TemO/rFkS8zzy6WmedoGfWN6Y5+KdPIl3bMiqjuZOWuvtCvo1/aNZ7cbpbuY7irL5Rl9Eu33uy0m6X7GO4qvTKNfvFmp6bLcFdPqmdqgG4d/WIVrplkuKsn1bMoRreOfmlXf7j6U2R2fvnSgYGBHBoa6nQz1COGtwzz2i+/lq3btrLfrP3YeMFGFh7Y2XK3kVEpUWs1BIpKuRXnOFqm3CLi7swcmGqflbs6rt51R7vx5mi3VuF2sfQvw10dV08XSztujnZrXzh4s1PTZ7iro+odf97IzdF6w7pbq3CwEtf0Ge7qqHq7WBq5OdqtYW0VrpnkDVW13PCWYc668SyuP/P63d7orL4xOmZPN0jbceOyHTc6VSKZsG0bvPQS7L9/sW3zZnj6aXjxxfGvvfeGE04o9v/oR/DII3D44XDKKQ1/9O5uqDq3jFpuunO4vO5PBtn6hh0T/ha+sHU7r33/IM9/Z+rzurkKr/WPjmZA5sTgHPs6/HDYZx949FF48MGJ+7ZuhbPOgtmzYfVq+OlPi23Vx3zlKzBrFlxxBaxcOXE/wF13Fd/PPx+++c3x982EefPgySeL/eedB9/97sQ2H344PPRQ8fMXvgC33QZ/8AdNhfvuGO7arelW4VCpqp8bhgtWwD47+OraFXz1rL9gwQELp6yqX5i3FmZN7GJh1igvzOu92Rf7os9727bi+6xZRaj96lcTg/PFF+Goo8b/pVu1anJ4nnkmvO518POfw9e/Pjl8P/c5eOMb4ZZb4JJLJr73iy/C7bfD0UfDV78KH/nI5DZu2FC8/9//PVx00eT9S5YU7bvtNvjLvyy2zZlTBP7s2fClLxXX98wzxTWM7Xv5y2G//cbf5/jji1/Xxs6bMwde9rLx/RdcUAR39XtX77/mmuK/Z/W2FrNbRlPa2f2x9Dw47usw9EG45fI9d38sPQ+OuaoI7W37wj1/Crdc3tHuj54dG54Jo6MTw23OHJg/v9h3xx2TK9fXvx6OO64478tfnhyO73wnLF0KTz0Ff/Znk/effz6cfTb8+7/DW986cf+OHUUgL18OQ0PwO78zuc3XXQfvfS/85Cfw9rdP3r9yJZx+Otx6K7zvfePBNxaQ3/gGDAwU53/xi+Pbx4656CI47DC45x744Q8n7ps9u3jvl78cHn64qJKr982eXVTPs2YV/32gqPJ39xeky+2uW8Zw7wM7w+rAYTjzLLjhenh24Z6D+sBhuOC1sM9WeGk/uGwjPLuwdoi+rOr4MZXzcsvkqr9d4V5XWG/fXvSNQvEr9pYtEwNun33g2GOL/T/+cfEG1ftf9aoi3AAuvbToV60O36OPhk9/uti/bFnRfVB9/rveVQQcwEEHFRVktfe/H66+uvh5772LwK320Y/CZZfBCy+M9/9WV5if+AR86lNFuJ944ng4jn0/91x4z3tgZAT+/M8nh+vSpcX1/8d/wD/+48Rgnj27qLoXLoTnn4dNmyaH7+zZsJfjOFrFPveSqTesdwbbSYNw2Bo4cRBuuXzP/dQnDUJUwiO27zxvWsePGTuP3ZxXy6ZN4+E4Fn6cUfv4T36y+FX3i18sXn/ta/CjH7F5oOr8OXOKX8kBPvQhWHTzxMr10EOLqg+KkB47dswb3wj33lv8fMklsHbtxP3HHz8e7jffDL/4xcSAmzt3/Ni5c4twrg7H444b33/xxcX1VO9/wxvG9996K+y778TgPPjgYt+cOfCb3xTbpqpOX/nK8euYyvz5Rb9zLfPmwR//ce39++8Pv/Vbtfdrxlm5t1Ajv843XFXDpC4T2M3oj1pV
+K+GJ/eLHnkkcdjW2lX4A9uKEFy3rvg1vHJ+7D0Ei9ZPbsDwm8mfjo6//+rVcMQRe67CP/1pGJw4NHIhwzzO5N8CFrCZzbMXF6H29NPFRV9ySRGw1eE4dy7cdFNx0t/+bfHrfXX4zpsHH/94sf/WW4s/hOrwPOgg+L3fK/Zv3FiMkKh+/zlzJvbNSjOo7d0yEbEEuAzYG7gyMz+/u+PrDfdGArEd5+wMq12OB8itVeG5777wilfA9u3ErEoXwFRB/exzsGLFpH7R+MKltcP6bScW4fT7vw/33QennUY89MvafeFMkbA330x844fjx4+pnJcX/tfiDv93vlN0E1SCL4Z/VfPPLN9z5nhAfvazsGgRC+eN8vhT+046dsGrdrD58b2KqveXv5zcNXDUUUWXxHPPjVe+Pd53KjWireEeEXsDvwDeAWwC7gL+KDPvr3VOveFeV+X661/D888Tr15U+5wbbpxYuS5aBMuWTe9zTj+96Jt98UXinrtrH18douefX9zs2rqV2G9O7aAeebL49bja7NnEi1trh/V/fjt87GPj/bmXXELc9PnaVfhf3Tw5PAcGiM8sqV2Ff+Wu4qbULrrmRqTUJ9od7v8J+J+Z+V8qry8GyMzP1TqnoXCvFYgHHFgMhfrZz4qDTzgB7riDIGufs2v1evLJsGrV9G4qLltW3DyaPZv4wfdrH//Z/z0enm96E7ztbZBJ7BW1g3r7juLG1Vj4VqrTum9cnnZe7Sr8B1P3hRvUUvdr9w3VQ4BHq15vAn53ikYtB5YDHHbYYfV/Sq2bfR/8ILz61ePHXXhhMbTrQ7s5Z/36iZXrAQfs+XPGrFxZdVG7Of5Tn5p8DRHFPwZvXjEevLNG4ZgVsPovYK+Fkyv3Xdu0871q37ictXgt26YYTz7riNrjyQ1wqbfNROV+JrAkM/+08vps4Hczc4onDgp1V+51Vq7tOqehz2igqt7nI8ewbf76SdtnjbyZl76ybspzJJVPuyv3x4DXVL0+tLKtdRoZcteGc/ZbMsgLUxy/35Lan9FIVW2AS9qTmQj3u4AjI+IIilA/C3hvKz+gkUBsxzlHnbqW9ZsnH3/UqQa1pPaaqaGQ7wb+hmIo5NWZ+dndHV+Wce6S1E5tf0I1M28BbpmJ95Yk7ZmTPEhSCRnuklRChrsklZDhLkkl1BWzQkbECPBwg6cfDDzZwub0mn6+/n6+dujv6/faC4dn5hSPsXdJuDcjIoZqDQXqB/18/f187dDf1++17/na7ZaRpBIy3CWphMoQ7rtZC6wv9PP19/O1Q39fv9e+Bz3f5y5JmqwMlbskaReGuySVUE+He0QsiYh/i4gNEXFRp9vTThFxdUQ8ERH3drot7RYRr4mI2yPi/oi4LyIu6HSb2iUi5kTEv0TEzyrX/r863aZ2i4i9I2JdRHy/021pt4h4KCL+NSLWR8Rup9Lt2T73RhbiLpOIOBF4Fvi7zDy60+1pp4hYBCzKzHsi4mXA3cAZ/fBnHxEBHJCZz0bEPsAa4ILM/OcON61tIuJCYAB4eWae1un2tFNEPAQMZOYeH+Dq5cr9eGBDZm7MzFHg28CyDrepbTJzNfBUp9vRCZk5nJn3VH7eAjxAsXZv6WXh2crLfSpfvVmhNSAiDgWWAld2ui3drpfDfaqFuPvif3CNi4jFwDHAnR1uSttUuiXWA08At2Vm31w7xSJAnwB27OG4skrg/0bE3RGxfHcH9nK4q89FxIHAjcDHMvM3nW5Pu2Tm9sx8M8X6xMdHRF90y0XEacATmXl3p9vSQW/NzGOBdwEfrnTPTqmXw33mF+JW16r0N98IXJeZN3W6PZ2Qmc8AtwNLOtyUdjkBOL3S7/xt4OSI+GZnm9RemflY5fsTwHcpuqen1MvhvnMh7ojYl2Ih7u91uE1qg8pNxauABzLzS51uTztFxPyImFv5eT+KAQX/r6ONapPMvDgzD83MxRT/v/84M9/X4Wa1TUQcUBlAQEQcALwTqDlarmfD
PTO3AR8BfkhxQ+0fMvO+zraqfSLiW8Ba4KiI2BQR53a6TW10AnA2ReW2vvL17k43qk0WAbdHxM8pCpzbMrPvhgT2qQXAmoj4GfAvwA8y89ZaB/fsUEhJUm09W7lLkmoz3CWphAx3SSohw12SSshwl6QSMtwlqYQMd0kqof8PuqRhysvbEXsAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# make the data\n",
    "t = np.arange(0., 5., 0.2)\n",
    "\n",
    "# draw three curves: red dashes, blue squares and green triangles\n",
    "plt.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 204,
   "id": "2adceea8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXIAAAEICAYAAABCnX+uAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAA6OklEQVR4nO3dd3xVRfrH8c+TRgghtCA1GAKIi4pSVFAB6wIKtp8Fu66I7IqgLogCCigWxLKiWFCwoqCrriBYUXAXRQQEUYpSJXRCSyCVzO+PCYJIIOXczDn3Pu/XKy9T7j3ne+Pw5NyZOTNijEEppVRwRbkOoJRSqny0kCulVMBpIVdKqYDTQq6UUgGnhVwppQJOC7lSSgWcFvIKJCI3isj/XOdQykvart3TQh5gInKmiKQf4TFnichXIrJTRFZXUDSlyqyE7XqAiPwkIpkiskpEBlRUPj/SQh5QIhJTwofuBsYDEd3QVTCUol0LcD1QA+gC9BGRHiEL5nNayENARFJE5H0R2SIiGSLy7EE/f1xEthddSXQ94Ps3iciSoquMlSJy6wE/O1NE0kVkoIhsBN4GPgbqi0hW0Uf9g7MYY+YYY94AVobuFatI4LN2/ZgxZr4xpsAYswz4EDg9ZC/e57SQe0xEooGPgDVAKtAAmHjAQ04FlgHJwGPAOBGRop9tBroBScBNwFMi0vqA59YFagJHY69GugLrjTGJRR/rQ/W6VGTzc7suOk8H4OfyvMYg00LuvVOA+sAAY8xuY0yOMebAgaA1xpiXjDF7gdeAekAdAGPMVGPMCmPNBD7DNtB9CoGhxphcY0x2xbwcpQB/t+th2Fr2ShmeGxa0kHsvBduoC4r5+cZ9nxhj9hR9mgggIl1FZLaIbBORHcD52CucfbYYY3JCkFmpI/FluxaRPtir+AuMMbllOUY40ELuvbVAo1IM2gAgIpWA94DHgTrGmOrANOygzj4HL1WpS1eqiuK7di0ifwPuAc4xxhx2lku400LuvTnABuBREakiIvEiUpJBmDigErAFKCgaLPrrEZ6zCaglItWKe4CIRIlIPBBrv5R4EYkr0StRaj+/tetrgIeB84wxET+Qr4XcY0V9hN2BpsBvQDpwZQmelwn0Bd4BtgNXA5OP8Jyl2FH+lSKy41Cj+0BHIBt7FdSo6PPPSvp6lAJftusRQC3g+wNmt7xQipcUVkQ3llBKqWDTK3KllAo4LeRKKRVwWsiVUirgtJArpVTAlWpOqFeSk5NNamqqi1OrCDBv3rytxpjaLs6tbVuFUnFt20khT01NZe7cuS5OrSKAiKxxdW5t2yqUimvbnnStiMh4EdksIj95cTyl/EDbtQoKr/rIX8WuCaxUOHkVbdcqADwp5MaYr4FtXhzLD9avh7FjoXt3OP54eOQR14mUC+HWrikshDlzYMgQaNMGzjwTvv3WdSrlgQrrIxeRXkAvgEaNGlXUaUvt6qvh7bft56mp0LQpbN9uv967F3r1grPPhq5doWZNZzGVjwSiba9eDe3bw8aNEBUFp58O27ZBfLz9+VdfwbRp9urltNMgxsnwmSqjCpt+aIwZa4xpa4xpW7u2kwkFf2IMPPoonHAC5Ofb73XqBA8/DIsWwcqV8Pnn8Nhj9mdr1sDUqXDttXDUUXDWWfZxKrL5sW2zYQNceikMKNrhr1EjOP98eOMN2LwZvv4afvwRWrWyP//hB3j6afsPoE4d6NsXcnTF5KCI2HnkeXlw/fVw771Qv769OAG49Vb7veOPB5E/PictzXa7zJ4N99wDS5fCGWfAr79WfH6lirV4MbRrB599BjVq2O9FRcG4cfYqpFatPz/nrrtg61Z4913o0gWeeQYuu6xic6syi8j3Tzt22IuVr76Chx6yhfvgol2cqCg49VT70bs3vPSS7X5RyhdmzICLL4bKle1Vd+vWR3rGfklJtnhfdhlceCFUrx6ikMprXk0/fBv4FmhetJHqzV4cN1T+
+U/43//su8xBg0pexA/WsCEMH26fv2IFPP647a5R4SFo7ZodO2wRr1/fvm0sTRE/2JVXQufO9vOXXwadG+9rnlyRG2Ou8uI4FWXkSLjhBujY0btjvvKKvbpfsgReeAFiY707tnIjaO2a6tXhgw/gpJP2d6mUV3a2nba1cSO88w5ccIE3x1Weipg+8k8/tRcreXmQnOxtEQd48EG4/34YPx66dYNdu7w9vlKHVFBg+/jGj7dfn3WWd0UcbBfNrFnwl7/Y7pYXX/Tu2MozEVHIx4+3FxKrV9t3n6EgYrtZxo2D6dPtH4r160NzLqUA2L3bXp28+CKsWhW689Sta/veu3a1fzTuvVf7EH0m7Av5J5/AzTfDOefAf/9rpw2G0t/+ZqcopqVBdHRoz6UimDHQsyd8/LHty3vwwdCeLzER/vMfuO02+5ZWC7mvhPWslY0bbV/48cfbNli5csWct3Pn/eNExpR9MFWpYr3yCkycaAdmbr21Ys4ZEwPPPrv/a23cvhHWV+SbN9spsxMnVlwRP9CWLfYu0I8/rvhzqzC3bRv89a8wcKCb80+bZueqZ2a6Ob/6g7Au5C1bwk8/wXHHuTl/1ar239v112t/ufJY//72CsFV/11iop2SeNttbs6v/iAsC/m339ob1XJz7Q08rsTH23cDe/bAddfZtVqUKpchQ/a/xXPZuDt2tNO03ngDXn/dXQ4FhGEh37EDrrrKTqf1w1IRf/mLvdv5yy/t/HWlyuz9922f+PTprpNYQ4bYtVn+8Q/45RfXaSJaWBVyY+zqhOvW2SvhatVcJ7Juugl69IDXXvPHHxcVQGvW2OlXbdvaVd38IDoa3nwTKlXav2SociKsZq289JJd8+fRR+1aKH4hYqf6GrN/1VClSqygAK65xvbNTZwIcXGuE+3XsCEsXAgNGrhOEtHC5oo8K8vep3DeeftX7vSTpCT7DiEnx940pNNwVYm98469u/KFF6BJE9dp/qxhQ3u1smSJ7UNUFS5srsgTE+1ib7VquR0DOpK33rL3ceTlwd//7jqNCoSrrrJ3V559tuskxdvXr7lkCSxYYIu7qjA+Lnklt2KF/e9xx9n27mc33miXex4wwPblK1WsnBxIT7dXu34u4mAzjhtnF9nq29d1mogT+EK+cqUt4I8/7jpJyURFwXPP2W5PV/dyqIB44glo3hzWrnWdpGSOOQYGD7ZTxvwysyZCBL6QDxhgB8979HCdpOQaN7b3c0yYAN984zqN8qV16+zslM6dISXFdZqSu+su28DvuMNeragKEeg+8i+/tFNrR4wIXpfcvffaLeKSklwnUb40cKCdpfLEE66TlE58PDz1lL1Cyc/XTZwrSGB/ywUF0K+f/eP/z3+6TlN6VarApEmuUyhf+uYb+3Zt8GDbwIPmoovsh6owge1a+eUX2LTJXrAEeW72hg12uYqdO10nUb7x9de2O+Xee10nKZ/PPrM3daiQC2whb9HCzla5+GLXScpn/Xp4/nl44AHXSZRv3HMP/PyzfdsWZFOm2HcVCxe6ThL2AlnIZ860XStVqwZ/OeQ2bexmFKNHw7JlrtMop3bu3L/JcdWqbrN4Yfhwu+3cHXfoHXAhFrhCvnChnVIbTu/YHn4YEhLgzjtdJ1FOPfCAXeP7t99cJ/FGzZp256IZM+C991ynCWuBKuTG2AHOGjXsgmvh4qijYOhQuzrp1Kmu0ygnli61b8tuvBEaNXKdxju9etmNAf75T3uzkAqJQM1aee89263y3HP2j3046dPH7mjUurXrJMqJu+6yb8v8srKhV6Kj7fZwixf7a7GvMBOYQp6dbW+iadnS/pEPN3Fx4dVdpEph6lT7duyJJ0K/O7gLHTrYDxUygelaWbfOjv+MHh3eu9MvXQpdu9qplSpCbNhgR7379HGdJLTGjw//1+hIYAp506Z2oLNTJ9dJQksEPv88/N5hq8Po2RPmzAn/roeVK2HMGPjhB9dJwk4gCvknn9gt3Py8PK1X
mje3410vvGA3hVFhLDsbJk+2o/iR0Lj797czFYYMcZ0k7Pi+9axbB5dcYu+RiBRDh9r/6k1CYe755+2t7HPmuE5SMapXt2vITJsG//uf6zRhxfeF/MEH7dpBkbTka0qKnV756qt6k1DY2rXL9p/99a/+2pcw1G6/3W4aMGiQ3iTkIV/PWlm+3K5Vf+utwVw7qDwGDYI6dXQrxLD11FOQkQEPPeQ6ScVKSLD95ImJrpOEFV8X8mHDIDbWLtcQaWrXjqzupIiydaudanjppdC2res0Fe/SS10nCDu+7VopKLBLT/TrB/XquU7jzocf6oytsLNihd1c9sEHXSdxZ88euyvMf/7jOklY8O0VeUyMXTxt717XSdxatsy+E+3RA844w3Ua5YlTT7X9huF8Q8SRVKpkb4T66CPo1k03oCgnX16RL11qp5xCZLd1sFfjdevapal1bCgMTJ9uN1WO9IYdHW239lq6FN5803WawPNlIb/tNujYUbf8Azs2dN99drbWp5+6TqPKZflyuwenziu1LrnEjhEMGwa5ua7TBJrvCvn06XYvzgED9N3WPj172lk7gwZBYaHrNKrMhg61XQp9+7pO4g8idgrmmjUwdqzrNIHmSSEXkS4iskxElotImedaGGOLVUoK9O7tRbLwEBcH//pXMPcmDTqv2jY//ghvv21H7+vW9TBhwJ17Ltx/P5xzjuskgVbua14RiQbGAOcB6cD3IjLZGLO4tMf68EN7k9u4cfbCRe134YWuE0QeL9s2Q4ZAUpJ9q6n2E7E7Caly8eKK/BRguTFmpTEmD5gIlGkL7WXL4Pjj4frrPUgVhgoK4LHHYOJE10kihjdte/duu5zl3XfbtUbUn61aZRcZ2r7ddZJA8qKQNwDWHvB1etH3/kBEeonIXBGZu2XLlkMeaOBAmD9f+8aLEx0N//63rQc6NlQhvGnbVarA7Nl6NX44mZnw+uswapTrJIFUYYOdxpixxpi2xpi2tWvXLvZxsbEVlSh49o0NrV0LL77oOo3ap0RtW0Qb9+G0bAlXXQVPPw0bN7pOEzheFPJ1QMoBXzcs+p4KgXPOgbPOskt0ZGW5ThP2tG1XpOHD7VvNSFt/xgNeFPLvgWYi0lhE4oAewGQPjqsOYd9V+ebNdrckFVLatitS06Zw88327ebq1a7TBEq5e6ONMQUi0gf4FIgGxhtjfi53MlWsdu3sTUJnnuk6SXjTtu3A/ffbMQVdHbFUPBlWNMZMA6Z5cSxVMnpzYMXQtl3BGjSAJ590nSJwfHdnpyq5TZvsTYIbNrhOopTHZs7cv1WWOiIt5AGWmWl3C9OxIRV2vvrKvu2cN891kkDQQh5g+8aGxo6191MoFTbuusuu2R6Ju8qUgRbygLvvPnujkN7lrMJKUpJdu/nTT203izosLeQB16CBXbP8jTdgcelXAFHKv/7xD6hfXzdqLgG9GT4MDBwI27bpjC0VZipXtosLbdli12+O9M04DkMLeRhITrYrRioVdq65xnWCQNCulTCycKEdGwr3d6EvvAALFrhOoSrM3r3wyivhv1FzVpYd7MrMLPVTtZCHkf/+196+/8EHrpOE1imn6PhXRBGB556ze0CG8wJDlSvbTUeWLCn1U7WQh5Heve167v/8J2Rnu04TOq1b2412VISIirILC61fD4884jpN6ERHw6232iuVUtJCHkZiYuwqoKtXwxNPuE7jvSlT7N4Du3a5TqIqXPv2cO21tmGvXOk6jfd69oTXXivz07WQh5mzz4ZLL7UXLunprtN4JzcX7rwTvv/evgNVEejRR+3VSrhtXvv553a2QjnW2tBZK2HoiSfg2GOhenXXSbzz1FOwYgV89pnuzxCxGjSwu5DXqeM6iXfy820/YZMm9kqljLSQh6HU1PBaf2X9ehgxAi66CM47z3Ua5VTPnq4TeOu55+zg5ocflmvHee1aCWNffml3zyosdJ2kfO6/3164hGO/vyqDggK7MuLYsa6TlM/OnfZ1
nHcedO9erkPpFXkY27gRJk6Ec8+1i2sF1fDh0KWLffepFNHR8M03MHeuHRBKTnadqGyqVbP/QFNT7RTLctAr8jB21VVw+ul2qYqdO12nKT1j7EeDBnDZZa7TKN8QsX3lmZl21bgg2vc2uUsXO6BVTlrIw5iInX67ZUswdxR67TW70fS2ba6TKN857ji7qNbYsfaW5iAxxhbwkSM9O6QW8jDXurXtVhk9Gn791XWaktu1C+65x047DKfZN8pDw4dDjRp27fIgeecdO+WwRg3PDql95BHgoYfg1FMhLc11kpJ78EG7ld2UKfbGPqX+pEYNuwZLSorrJCW3ezcMGACtWnk6cKWFPAIcddT+WVvZ2f6/oWbmTLv/bs+ecPLJrtMoXztwtkcQGnffvvZOvbfe8nRZXr3WiSBffQWNG8OiRa6THN4DD9gZKrqZuiqxPn3g/PPtSol+9euvdgeYQYPgjDM8PbQW8gjSooX975VXwp49brMczocfwtSpULWq6yQqMNq2hRkz7PKfftWsGcyfD8OGeX5oLeQRpE4dePNNWLoU7rjDdZo/+/JL+wcmMdG2eaVK7IYb7CYUw4bZ9Zz9JC/Pri0BdnnSGO97tLWQR5hzz7WzQV56CSZNcp1mvx9+gK5d7X67SpWaCDz/vB3Rv/pqf81ZHTwYOncO6W4oWsgj0PDhdlXQ2bNdJ7Gysmx3T3JycO/vUD5Qtaq9UzI3F37+2XUa6+OP4fHH4e9/h5NOCtlpdNZKBIqNhS++gIQE10msPn3syoZffhncu62VT7RpYxfk90Pj3rDBdvmccELIFwrSK/IIta+dL1xoN6NwZcIEewfnffdBp07ucqgwkpBg75585hl3d30WFsJ119m3mxMnhnxapBbyCPfSS3bgc99YTEU79VR7p/WQIW7Or8LUjh12d5Urr7Q34VS0qCh7Nf788/uni4XydCE/g/K1UaPsshXXXQdr1lTceXftsl2ZTZvCmDEhGchXkaxGDft275df7D6YFTm/fOtW+9/rrrPFvAJoIY9wlSvb2Su5udCunZ3mGmrr1kGHDnazaKVC5qyz7N1lEybAFVdUzI7k775rl6WdPj305zqAFnLFccfBrFkQF2cX1wqlRYvsH4xVq+wyu0qF1JAhdp/AyZNDO03LGDugecUVdnZKCGeoHIq+oVWALebffWfXugd7Y47XA//Tp9t9ABIT7T0bJ57o7fGVOqQ77oCLL7ZXyuB94967157j2Wfh8svh9dchPt6745eAXpGr39Wta7tadu6EU06xFzPGeHPsrCzo0QMaNbIXRlrEVYXaV8Q/+cQu5OPl1fm779oi3r+/naFSwUUc9IpcHUJCgr1h6KGH7ADouHG226Us9v0hSEyEadPgmGP2X/UrVeGaNIEqVWz/+VtvwSWXlP1Yxtg7SvfdzXbuud7lLCW9Ild/EhtrN14ZMcKuzdKli93/s7QyM+GWW/bfC3HyyVrElWPNmtn9Pk88Ef7v/2z/eX5+6Y+zb7Dnl19sMXdYxKGchVxELheRn0WkUETaehVKuSdil4h4/XXbn3333fb7xsCyZcV3uaxdC889Z4t/crK9ms/MrLjcXtG2HcaOOsreRnzhhXZ3oVmz7Pc3bYLt2w/9nMJC+wfgnnvsgFLLlrBypW82wy1v18pPwKXAix5kUT503XV2hdB903CXLLHtOC3Nrul/4YWQlGS3lIuKsvdgPP+8vfDp08de9Jx2mtvXUEbatsNZQgK8955dL/n00+33Ro2ymzp36GAbd+fO9vvHHQcFBfbqJDsbOnaEXr3sDJV69Zy9hAOJ8WA0S0RmAP2NMXNL8vi2bduauXNL9FDlM9u22bGdKVPsei25ufb7331nB0hXrLBtvnlzdxlFZJ4xxpOraG3bEWTBgv2Ne9/uK8cdBz/9ZD+fNct+7XAT2eLadoUNdopIL6AXQKNGjSrqtMpjNWvaG+VuvdXe+fzFF/a/+wp3kyZu87mgbTtM7Jv//dBD
duGt6dPtrcf7BjX3Xbn70BELuYh8AdQ9xI8GG2M+LOmJjDFjgbFgr1pKnFD5VpUqcNFFrlOUnbZtVazUVE83Rw61IxZyY4zb4VilQkTbtgoXOv1QKaUCrlyDnSJyCfAMUBvYASwwxnQuwfO2AMWttZcMbC1zKO/5LQ/4L5Pf8hxtjKldngOEoG377XcE/svktzzgv0yHbNuezFrxkojM9WrGgRf8lgf8l8lvefzIj78jv2XyWx7wZ6ZD0a4VpZQKOC3kSikVcH4s5GNdBziI3/KA/zL5LY8f+fF35LdMfssD/sz0J77rIw9nInIj0NMYc4brLEp5Rdu1e368IlclJCJnikj6ER5zp4isFJFdIrJeRJ4SEV2+WPlWSdr1AY+NE5ElJX18uNJCHlClKMaTgdbGmCTgeOBEoG/IgilVDmW4yBgAbAlFliBxVshFpIuILBOR5SJyzyF+XklEJhX9/DsRSXWc50YR2SIiC4o+eh7mWCki8n7R4zNE5NmDfv64iGwXkVUi0vWA799UdHWRWXQFvUtEfir62Zkiki4iA0VkI/A28AXQQET2isgeEal/cBZjzApjzI59pwAKgaZl+P2MF5HN+/Ic4udnisjOA34/95f2HOHqSG3LQZ7D/r88zPO8aNcrReTWA352pohsEJEVIlIAZACfAfVFJKvo40/tuui5jYFrgUdK8zpK8DrjRWSOiCwsWsp4uJfHDwljTIV/ANHACiANiAMWAi0Oesw/gBeKPu8BTHKc50bg2RIeayHwFFAFiAfOOOAY+cAtRY/7O7Ce/WMVFwBNsAX3diAbWF70szOBAmAkUAm4GPgOSAfaAd8dJtPVwC7AYK9eTizD76gj0Br4qZifnwl85KI9+fmjJG3LQabD/r88zOvwol13AvZg3yUe2K5fLWrXtYG1wMYSZPoIuKToGOke/n4ESCz6PLbo31k7123pcB+urshPwRaolcaYPGAicPDySxcBrxV9/m/gHBERh3lKc6z6wABjzG5jTI4x5n8H/HyNMeYlY8xe7OurB9QBMMZMNfYK2hhjngH+i/1Hs08hMNQYkwt0BT4tet5soLqIHHJxZGPMW8Z2rRwDvABsKu2LMsZ8DWwr7fOUp23LE2X8f+lVu56JveLucMBzC4HexphcY8y+O2OjDxem6M7baGPMB6V8HUdUlDOr6MvYog9fzwpxVcgbYP/q7pNe9L1DPsYYUwDsBGo5zAPwfyLyo4j8W0RSijlWCrZRFxTz8983TTPG7Cn6NBFARLqKyGwR2SYiO7BXGgc26C3GmJwDMh/YN1hc5t8ZY34FfgaeO9zjyqF90dvRj0XkuBCdI2hK2rb8zst2fT721vd9fm/XRV2ozYC84oKISBXgMUI41iMi0SKyANgMfG6M+S5U5/KCDnaW3BQg1RjTEvic/e8WDrYWaFTaQRsRqQS8BzwO1DHGVAdmHPSwg68KynKVEIN9m+u1+dh1IE7ErlHynxCcQ7njZbuehu2+2McUPTax6LHPcPi23QxIBf5bNF70PlBPRDZ6NZZmjNlrjDkJaAicIiLHe3HcUHFVyNdh/8Lv07Doe4d8TFHjqYYdCHGSxxiTUdSlAfAy0KaYY80BNgCPikiVooGTkqxIH4ftI9wCFBQNFnU4zOPXFT2+lohUO1RmABHpKSJHFX3eArgXmF6CPKVijNm17+2oMWYaECsiyUd4WiQoSVsPAi/b9V8PfpCIxGKL+ARsV+q+dn0oP2F/pycVffTEdheexB/f/ZSbsRMFvgK6eHlcr7kq5N8DzUSksYjEYQczJx/0mMnADUWfXwZ8aYpGH1zkOaj/+UJgyaEOVNRH2B07M+Q37FvpK48UwBiTiX2r+A6wHTtA+cVhnjIZOBc7e+U39g8mHex0YJGI7MZeCU0DBh0pT2mJSN19Yxgicgq2bYXqD2+QlKSt+57H7fpQr38csMQY86QxZim2Xa8UkR0Hz1oxxhQYYzbu+8D29xcWfb237K/SEpHa
IlK96PPKwHnA0vIeN6RcjbJi+8l+wY7oDy763gPAhUWfxwPvAsuxVwNpjvM8gu1fXoj9C31siPO8jb0Cysf+o7kZ6I0dFAJbtMcU5V0EtHWcp88Bv5/ZwGmu2pbfPg7Vthzn+dP/S8d5zsB2pfwILCj6ON9hnpbAD0V5fgLud/3/7Egfeou+UkoFnA52KqVUwGkhV0qpgNNCrpRSAedkFbzk5GSTmprq4tQqAsybN2+rKeeenWWlbVuFUnFt25NCLiLjgW7AZmPMESfOp6amMnfuXC9OrdSfiEhxG3uX9jilategbVuFVnFt26uulVfx+YR5pcrgVbRdqwDwpJCbMF1QKXd9Lju/3UlhfqHrKMqBcG3XhYWFzJ8/n19//dV1FOWRCusjF5FeQC+ARo0aVdRpSy1rYRZbP9xKxpQMMudmAlCzS01aftwSgIKsAmISdYMdtV8Q2nZOTg6fffYZkydPZurUqWzcaNe4euedd7j88svJzs4mNjaWmBht20FUYf/XjDFjKdrItG3btr65C6kwt5DMuZlUO90u67Bi4Aq2f7adpHZJNH64MZWbViYmyf6a8rfn8029b0g6NYnkC5OpdWEtEpoluIyvfMCvbXvDhg3s3LmTY489loyMDC666CKSkpLo0qUL3bp1Y+fOnZx11lkAvPTSSwwfPpzzzz+f7t2706VLF5KSkhy/AlVSEf3nN3tVNovOX8SeZXs4beNpxB0VR9N/NSW2ZixxR8X96fFmryGlfwoZUzJY0X8FK/qvoNGgRjQe0ZjQLZWuVOm9+uqr9OrVi86dOzNlyhQaNGjAN998Q5s2bYiL+3PbPumkk+jWrRtTp07lzTffJDk5mSlTptCuXTsH6VVpRew88l1zdzG/3XzyNuVx3LvHEVPd/k2rcmyVQxZxgLjkONJGpHHywpNpt7oddf9Wl/Sn0slenl2R0ZUqljGG4cOHc9NNN9GpUydGjhz5+8/at29/yCIO0LFjR1577TU2bdrEjBkzqFatGkOHDq2o2Kq8XCzC06ZNG+PSlilbzMyEmebb1G9N1pKsMh+nsLDQ7F66+/ev9+bs9SKeKidgrnHQro3jtp2Xl2duuukmA5gbb7zR5OXllflYmzdvNhkZGcYYY3JycryKqMqpuLbtSdeKMeYqL45TUfYs3UOVFlU44aMTiKtz6CuUkhAREprbPvL1L61n3eh1nDDtBOJT4r2KqhwKWrvOy8tj0aJFDB06lKFDh5aru692bXvPSW5uLp07d6ZNmzaMGjWKqKiIfRPvaxHTR24KDdkrsklolkDKP1NoeHtDoip51ygrN6lMzm85zG83nxOmnkDVk6p6dmylDmf9+vVUrVqVqlWr8r///Y9KlSp5duyYmBhatmzJk08+yW+//cYbb7xBfLxeqPhNRPx5LcwtZMm1S5jXdh6563IREU+LOECNs2vQalYrJEpY0GEB2z4Nu+nHyocWLVrEqaeeSs+ePQE8LeIA0dHRPP300zzxxBP8+9//5txzzyUjQ/cL8ZuwL+SFuYX82PVHNr+9maMHHU1c/bJ3pRxJ4vGJtJ7dmvgm8SzqtojMHzJDdi6l5syZwxlnnEFhYSH33ntvyM4jItx111288847zJ07lwsuuIDCQr1Jzk/CvmtlxcAV7PhqB8e+dix1r68b8vNValCJVl+3YuesnVRtpd0rKjS2b9/O5ZdfTq1atZg5cyYpKSlHflI5XX755aSkpFCvXj3tK/eZsP6/sf2r7ax7eh0N+jaokCK+T0xSDLW61gIga1EWe3PKvY2gUn/Qr18/1q9fz8SJEyukiO/Trl07jj76aIwxzJs3r8LOqw4vrAt5tdOr0eTJJjR5rImT8+eszWHeyfNYefdKJ+dX4Wvo0KG8+uqrnHLKKU7OP2rUKNq1a8ecOXOcnF/9UVgWcrPXkL89n6i4KFLuTPF8YLOk4lPiafD3Bqx7Zh1bJ291kkGFl4yMDIwxNGnShGuuucZZjltuuYX69evT
o0cPdu7c6SyHssKykK95eA1zT5xL3uY811FIezSNxNaJLL1pKTlrc1zHUQG2e/duOnTowG233eY6CjVq1ODtt9/mt99+49Zbb913A5VyJOwK+Y7/7mD1sNVU71S92FvtK1JUpShaTGyByTMsuWYJhQU62q/K5o477mDp0qVceumlrqMAcNppp/Hggw8yadIkxo8f7zpORAurWSv5GfksuXoJldMq0+y5Zq7j/C6hWQLNnm/GnsV7XEdRATVp0iRefvll7r33Xs4991zXcX43cOBAFi9eTFpamusoES1sCrkxhqU3LyVvUx6tv21NTFV/vbS61+6fNWOM0dUSVYmtXLmSXr160b59e4YPH+46zh9ERUXxxhtv/P61tm03wqZrpXBPIYXZhaSNTKNqG//O3945a6dddXGr+/57FQzr1q2jbt26vPXWW8TGxrqOc0jGGO677z5uv/1211EiUtgU8ugq0bT8uCUN72joOsphRSVEkbUgi2U9l7mOogKiQ4cOLF68mNTUVNdRiiUi5OTkMGbMGD788EPXcSJO4Au5MYZVQ1eRvTIbiRLfv62r2qoqjUc0JuPDDDKm6ZoVqnjLli1j5MiR5ObmEh0d7TrOET300EO0bNmSfv36kZ2ta/RXpMAX8i3vbWHNA2sCtUhVw34Nqdy8MsvvXE5hns5iUYd255138vDDD7Njxw7XUUokLi6Op59+mjVr1jBq1CjXcSJKoAv53uy9rOi/giotq1C/V33XcUosKi6Kpk81JfuXbDZP3Ow6jvKhqVOn8vHHHzN06FDq1KnjOk6JnXnmmVx22WU88cQTZGbqonEVxV9TO0pp7ai15K7J5divjkWi/d2lcrBaXWtx4hcnUv3s6q6jKJ/Jy8vjzjvvpHnz5vTp08d1nFJ78skn2b17N1Wr+nfSQbgJbCHPWZvDb4/+Ru3LalPjzBqu45RJjXNs7oKsAmISA/u/Qnls9OjR/Prrr0ybNq3YPTb97MBFvLKyskhMTHSYJjIEtmslOjGaer3q0eRxNwtieWXH1zv4tuG37Pp+l+soyifOOOMM7r77brp27eo6Srn07duXjh07snevrv4ZaoEt5LE1Ymn2r2bEHx3sbacST0okKj6K5X2XYwp1vQpll4odOXKk6xjldvrpp/PDDz8wbtw411HCXuAKudlrWHL9EnZ+Gx4rrsUkxZD2SBq7Zu9i01ubXMdRDn3//ff07t2b7du3u47iiSuuuIIOHTowePDgwMy8CarAFfIN4zaw6Y1N5P6W6zqKZ+reUJeqJ1dl5cCVFGQVuI6jHCgsLKRv37785z//CcSc8ZIQEZ5++mkyMjJ8t7RAuAlUIc/fns+qwauo1qEata+o7TqOZyRKaDq6KXkb8sj4SG8SikQTJkxg9uzZPProoyQlJbmO45lWrVpxyy23MGHCBLKyslzHCVviYh3htm3bmrlz55b6ecvvXE760+m0mdcmLPfD3PPLHhKOSXAdI/BEZJ4xpq2Lc5elbWdlZXHMMcfQsGFDZs+eHXb7YW7bto3CwkKSk5NdRwm84tp2YOa87Vm2h3XPrqPeLfXCsogDvxfxnDU5gR/EVSX36KOPsmHDBt5///2wK+IANWvWBGz3UXp6Oo0aNXKcKPwEppDHN44nbVQada4Jzl1uZbHts238eP6PnPjFiYGdH69Kp2fPntSvX5927dq5jhJS119/PXPmzGHx4sXExASm9ARCYP78R8VFkXJHCnG1g3eDRGlU61CNuDpxrBq0SrfPihCpqan84x//cB0j5K644gp+/fVXXn31VddRwo7vC7kxhsXXLmbzpMhYkyS6cjSp96ey69tdZEzVgc9w9ssvv9CtWzdWrVrlOkqF6N69O6eeeirDhw8nJ0f3r/WS7wv59s+2s3nCZvK35ruOUmHq/q0u8U3iWTV4ld4kFMbuv/9+ZsyYQUJCZAxwiwgPP/ww6enpvPDCC67jhBVfF3JjDCsHrSQ+NZ56t9RzHafCRMVG
0fiBxuxZtoeshTplKxwtWLCASZMmcccddwRqdcPyOvvssznnnHOYNGmSdh16yNcjDlvf30rW/CyOffVYouJ8/TfHc0f1OIrqnapTqUEl11FUCAwZMoQaNWrQv39/11Eq3JtvvkmtWrV8vwlMkPi2Opq9hlVDVpHwlwTqXBs5Vyz7SJT8XsTzNuv+nuFk1qxZTJ06lYEDB1K9enXXcSpc3bp1iY2NZc+ePbpmuUd8W8iJgsaPNKbp6KaBW2vcS8vvWs68NvPYm6MryIWL448/nhEjRgRyrXGvZGZmcswxx/DQQw+5jhIWfFvIRYTaF9em5rk1XUdxqlb3WuSm57L++fWuoyiPVKtWjcGDB1OlShXXUZypWrUqZ511FqNHj2bDhg2u4wSeLwv5hnEbWHX/KgoLdD/LGmfVoMa5Nfjt4d8oyNQFtYKssLCQa6+9li+++MJ1FF8YNmwY+fn5jBgxwnWUwPNdId+7ey8rB69kx9c7IrpL5UCNH25M/tZ80v+V7jqKKof333+fCRMmsH69vrsCaNKkCbfccgtjx45l5cqVruMEmieFXES6iMgyEVkuIveU51jpz6STvymftIfTdFS7SNLJSSRfksym1zdh9uqUrYrkVdsuKCjgvvvuo0WLFlxzzTVeRgy0IUOGEBMTw1tvveU6SqCVe/qhiEQDY4DzgHTgexGZbIxZXNpj5W/PZ+3ItdTqVotqp1Urb7Sw0mxMM6KrRuu7lArkZdt+4403WLp0Ke+//37YrDfuhfr16/Pjjz/StGlT11ECzYsr8lOA5caYlcaYPGAicFFZDrT28bUU7Cig8YjGHsQKL5XqVSImMYbCgkLtK684nrTt3Nxchg0bRtu2bbn44ou9zhh4zZo1Q0R0F6Fy8KKQNwDWHvB1etH3/kBEeonIXBGZu2XLlkMeqPb/1SbtsTQST9Rdtw+lML+Qea3msWLACtdRIoUnbTs6Opr77ruPxx9/XLsLizF9+nTq16/PnDlzXEcJpAob7DTGjDXGtDXGtK1d+9C7+1RtXZVGA3St4uJExUZRrWM1No7bSPaKbNdxVJEjte2YmBh69uxJp06dHKQLhlNOOYXExEQGDRrkOkogeVHI1wEpB3zdsOh7KgSOHnI0EiusHrbadZRIoG27glStWpVBgwYxffp0pk+f7jpO4HhRyL8HmolIYxGJA3oAkz04rjqESvUq0eD2BmyasImsn3RBrRDTtl2BevfuTUpKCoMHD9YFtUqp3IXcGFMA9AE+BZYA7xhjfi7vcVXxGg1sRHTVaDaM1TviQknbdsWKj49n6NChzJkzh/nz57uOEyierH5ojJkGTPPiWOrIYmvG0mpWK6r8JXJv8a4o2rYr1g033ED79u1p0aKF6yiB4rs7O1XJJB6fiEQLe3frYloqfMTExPxexHfv3u04TXBoIQ+wzPmZfNvoW7ZP3+46ilKeuv/++znppJPIz4+cncHKQwt5gCW0SCC6SjQrB63UwSEVVk499VSWL1/O+PHjXUcJBC3kARYdH03q0FQy52Sy9cOtruMo5Znzzz+f0047jQceeIDsbL1n4ki0kAdcnRvqUPmYyqwaskoX1FJhQ0R45JFHWL9+PWPGjHEdx/e0kAdcVEwUjR9szJ6f92hfuQorHTt2pHPnzjzzzDPs3auD+ofj682XVcnUvqw2rb5tRbV2umKkCi/PPfccVapU0RUjj0ALeRiQKPm9iBfmFhJVKbzfaBUWFBIVE96vUVlpaWkAGGPIy8ujUqVKjhOFVkFBATExpS/L+q8hjKwbs47vmn5Hwc7wXuZ2eb/lrH5gtesYqoLk5+fTqVMn7rzzTtdRQmrnzp2cfPLJzJ07t9TP1UIeRpLaJZG7Ljfsi1yzZ5pR59o6rmOoChIbG8tJJ53Eiy++yMKFC13HCZlq1arx2Wef0axZs1I/Vwt5GKnapip1/1aXdaPXsXtp+N0VV5BZQN6WPCRKqJxW2XUcVYGGDx9OjRo16NevX1je
M7F69WoKCwupXbs21aqVfqxLC3mYSXs4jaiEKFbcFX6bT6wZsYY5zeeQn6F3+0WaGjVq8OCDDzJz5kzee+8913E8tWfPHjp27MjNN99c5mNoIQ8zcUfFkTo0le2fbw+rq/I9v+4h/al0ki9KJrZWrOs4yoFevXrRsmVLnn32WddRPPXYY4+xdu1a/va3v5X5GDprJQw16NOAWt1qkXBMgusonllx1wqi4qNo/Iju5xqpoqOj+eCDD6hXr57rKJ5Zs2YNI0eO5Morr6RDhw5lPo5ekYehqLio34t47rpcx2nKL+OTDDI+yuDo+46mUt3wnn6mDi8tLY3KlSuTnZ1NRkaG6zjlNmDAAESEUaNGles4WsjD2JpH1zDnL3PI3RjsYr5jxg4qH1OZhv0auo6ifKCgoIC2bdty2223uY5SLjt37uSHH37gnnvuISUl5chPOAwt5GGs9v/VpjCnkFWDVrmOUi5NHm1Cm+/bEBWnzVXZNcsvv/xyJk2axNdff+06TplVq1aNn376ibvvvrvcx9J/GWEsoVkCDe9oyMZXNrLr+12u45Ra3uY8dv9sB2xjknQ4R+139913k5KSQt++fQO5DsucOXPYvXs3lSpVIj4+vtzH00Ie5o4ecjSxdWJZ3nc5pjBY829XDV7FvJPn6XRD9ScJCQk8/vjjLFy4kJdfftl1nFLZtm0b559/Pj179vTsmFrIw1xMUgxpj6axZ9kespcHZ13nzHmZbBi3gfp/r6/TDdUhXX755XTs2JFp04K1peqwYcPYvn079957r2fH1PerEaDu9XVJvjCZ2JrBKIiF+YX8ctsvxNaOJfX+VNdxlE+JCB988AE1atRwHaXE5s2bx3PPPcett95Ky5YtPTuuXpFHAIkSYmvGYgoNm9/Z7PtbnFcPW03md5k0G92MmGp6raGKV7NmTUSE9evXM3PmTNdxDiszM5MePXpQr149RowY4emxtZBHkC3vb2HxlYtZ9+w611EOK6pSFPV61eOoK49yHUUFRM+ePbn44ov57bffXEcpVlZWFvXr1+ett96iZs2anh5bC3kEqf1/tanVrRYr+q8g84dM13GKlXp/Kse8cIzrGCpARo8eTUFBAVdffTUFBf5cxrlevXrMmDGjXHdwFkcLeQQREZq/0pzY5FgW91hMQZZ/GrwpNCy5YQnbPt8G2KxKlVTTpk158cUXmTVrFsOHD3cd5w+WLVvGZZddxpYtW0LWrrWQR5i45DhavNWC7OXZLO+73HWc3619ci2bXt8UqJk1yl+uvvpqbrrpJh566CHf9Jfn5ubSo0cPZsyYQV5eXsjOoyNJEah6p+o0faopia0TXUcBYNf3u1h17yqSL02mfu/6ruOoAHvmmWeoXr06J554ousogL1xacGCBUyePJkGDRqE7DxayCNUw7771y1xuc9nwa4CFvdYTFz9OJq/3Fy7VFS5VKlShSeffBKAvLw8YmJiiIpy07YnT57M6NGj6devH927dw/pubRrJcKtfnA180+fT2FuoZPzbxi/gZw1ObR4qwWxNYIxz1353/bt22nfvj1PP/20k/MXFhZy33330apVK0aOHBny82khj3CJLRPJmpfFyntXOjl/w34NaT27NdVOL/32VkoVp3r16jRq1IiBAwcyb968Cj9/VFQU06dP57333qNSpdAvvayFPMIlX5RMgz4NSH8qnXVjKm5+ecYnGWQtykJESGqbVGHnVZFBRBg3bhx16tTh0ksvZenSpRVy3oKCAp599lny8/NJTk6mceOK2QhFC7kibVQatbrX4tc+v7JySOivzNe/vJ5F3RaxanCwl9dV/lazZk0+/PBDcnJyOO200/j+++9Der6srCwuuugibr/9dj766KOQnutgWsgV0fHRHP/B8TTo04DEE0M3k8UYw6r7VvHLLb9Q49wa/GXCX0J2LqUAWrduzbfffkv79u05+uijQ3aeDRs20KlTJz755BNeeOEFLrnkkpCd61B01ooCQKKF
Zs80+/3rjGkZJLVL8myhrcK8Qpb1XMamNzZR9+a6HPP8MUTF6nWECr20tDSmTp0KQH5+Pu+//z5XXHGFZzOklixZQteuXdm6dSuTJ0/mggsu8OS4paH/ktSf5Gfks/jKxcw/bT7Zq7y7QSd3XS6pD6bS/KXmWsSVE+PHj6dHjx7cfvvtnm1IkZ2dTWxsLDNnznRSxEELuTqE2FqxnDDtBPI35zO/3Xx2zS377kI5a3PI25pHVFwULT9tSeqQVJ0rrpy55ZZb6N+/P2PGjOHSSy9lz549ZT7WggULANt9s2TJEtq0aeNRytIrVyEXkctF5GcRKRSRtl6FUu5V71CdVrNaEZ0QzYJOC9jwyoZSPT/ntxzWjVnH/HbzWXq9nTEQFROc6wZt2+EpKiqKUaNG8cwzzzBlyhQ6duxYqhkte/fuZdasWdx+++20atWKd999F7D7iLpU3rP/BFwKvOhBFuUzVf5ShVbftmLJ1UvIWZkDwN6cvawavIpa59eiWsdqf+oi2TB+A+mj09m90O61mdAigbSRaRWe3QPatsNYnz59SElJoXfv3sTFxQHw+eef8+OPP3LhhRfSrFmzPzw+JyeH3r17M3XqVLZu3UpMTAw33XQTF154oYv4f1KuQm6MWQK6Ul04q1S3Eid9edLv+33u/mk368asI/3JdKKToqnZtSYxSTE0ebwJMUkx5G/Lt9vLPWanNCY0Twhk+9C2Hf4uuugiunfv/vst/B9//DFPPfUU/fv3p3nz5nTu3JmGDRsyYMAA4uPjWbZsGX/961/p3r07Xbp0oXr16m5fwAHEi91iRGQG0N8YM/cwj+kF9AJo1KhRmzVr1pT7vMqNvbv3su3zbWRMySDjowwKswtp+WlLqrWvhjHGefETkXnGGE+6Q7RtR5bVq1czZcoUJk+ezMyZM2nVqhWzZ89GRHzdto9YyEXkC6DuIX402BjzYdFjZnCExn6gtm3bmrlzS/RQ5XOm0ICx0xf9oqSFXNu2OpyCggLnfd8HK65tHzGlMebc0ERS4UCi/FPAS0vbtjocvxXxwwnONAKllFKHVN7ph5eISDrQHpgqIp96E0spt7RtqyDxZLCz1CcV2QIUNyKUDGytwDheC3p+CP5rONoYU9vFiQ/Ttv34O/VbJr/lAf9lOmTbdlLID0dE5no148CFoOeH8HgNfuPH36nfMvktD/gz06FoH7lSSgWcFnKllAo4Pxbysa4DlFPQ80N4vAa/8ePv1G+Z/JYH/JnpT3zXR66UUqp0/HhFrpRSqhS0kCulVMD5ppCLSBcRWSYiy0XkHtd5SktEUkTkKxFZXLSOdT/XmcpCRKJF5AcRqdjdY8OYn9q2n9upn9qeiFQXkX+LyFIRWSIi7V1nOhxf9JGLSDTwC3AekA58D1xljFnsNFgpiEg9oJ4xZr6IVAXmARcH6TUAiMhdQFsgyRjTzXWeoPNb2/ZzO/VT2xOR14D/GmNeFpE4IMEYs8NlpsPxyxX5KcByY8xKY0weMBG4yHGmUjHGbDDGzC/6PBNYAjRwm6p0RKQhcAHwsussYcRXbduv7dRPbU9EqgEdgXEAxpg8Pxdx8E8hbwCsPeDrdHzQuMpKRFKBVsB3jqOU1r+Au4FCxznCiW/bts/a6b/wT9trDGwBXinq6nlZRKq4DnU4finkYUNEEoH3gDuMMWXftbiCiUg3YLMxZp7rLCr0/NROfdj2YoDWwPPGmFbAbsDX43Z+KeTrgJQDvm5Y9L1AEZFY7D+OCcaY913nKaXTgQtFZDX27f/ZIvKm20hhwXdt24ft1G9tLx1IN8bse6fyb2xh9y2/DHbGYAeEzsE28u+Bq40xPzsNVgpi94B6DdhmjLnDcZxyEZEzsbvi6GBnOfmtbfu9nfql7YnIf4GexphlIjIMqGKMGeAy0+H4YgsMY0yBiPQBPgWigfFBKuJFTgeuAxaJyIKi7w0yxkxzF0m55sO2re20ZG4HJhTNWFkJ3OQ4z2H54opcKaVU
2fmlj1wppVQZaSFXSqmA00KulFIBp4VcKaUCTgu5UkoFnBZypZQKOC3kSikVcP8P6C0fdrOgGQwAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 4 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "t=np.arange(0.0,2.0,0.1)\n",
    "s=np.sin(t*np.pi)\n",
    "# draw a 2 x 2 grid of subplots with plt.subplot(nrows, ncols, index); this is the first chart\n",
    "plt.subplot(2,2,1)\n",
    "plt.title('chart 1')\n",
    "plt.plot(t,s,'b--')\n",
    "\n",
    "# this is the second chart\n",
    "plt.subplot(2,2,2) \n",
    "plt.title('chart 2')\n",
    "plt.plot(2*t,s,'r--')\n",
    "\n",
    "# this is the third chart\n",
    "plt.subplot(2,2,3)\n",
    "plt.title('chart 3')\n",
    "plt.plot(3*t,s,'m--')\n",
    "\n",
    "# this is the fourth chart\n",
    "plt.subplot(2,2,4)\n",
    "plt.title('chart 4')\n",
    "plt.plot(4*t,s,'k--')\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "04000d0b",
   "metadata": {},
   "source": [
    "## String Processing\n",
    "---\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "49cc83f6",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "6,8,18,19,22,23,25,29,36,47,48,52,53,58,60\n",
      "[6, 8, 18, 19, 22, 23, 25, 29, 36, 47, 48, 52, 53, 58, 60]\n"
     ]
    }
   ],
   "source": [
    "x='6#8#18#19#22#23#25#29#36#47#48#52#53#58#60'\n",
    "# replace every '#' with ','\n",
    "print(x.replace('#',','))\n",
    "# split the string and convert it to a list of integers\n",
    "print([int(tok) for tok in x.split(\"#\")])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b9cf3e27",
   "metadata": {},
   "source": [
    "## Date and Time Processing\n",
    "---"
   ]
  },
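  {
   "cell_type": "markdown",
   "id": "f088aa41",
   "metadata": {},
   "source": [
    "### Classic problem: longest consecutive login streak\n",
    "1. Find each user's longest consecutive login streak, together with its start and end dates.\n",
    "2. Find each user's most recent consecutive login streak, together with its start and end dates.\n",
    "* References:\n",
    "* [How to elegantly compute the maximum number of consecutive days](https://zhuanlan.zhihu.com/p/127365126)\n",
    "* [Several approaches to counting consecutive-behavior days with pandas](https://jishuin.proginn.com/p/763bfbd5b82d)\n",
    "* [pandas time series: time basics, time deltas, periods, and date offsets](https://blog.csdn.net/FGH333xwy/article/details/111185613)\n",
    "\n",
    "### 1. Python approach"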
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 245,
   "id": "412a586e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>userid</th>\n",
       "      <th>date</th>\n",
       "      <th>sign_in</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>A</td>\n",
       "      <td>2020-04-01</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>A</td>\n",
       "      <td>2020-04-02</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>A</td>\n",
       "      <td>2020-04-03</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>A</td>\n",
       "      <td>2020-04-04</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>A</td>\n",
       "      <td>2020-04-05</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  userid       date  sign_in\n",
       "0      A 2020-04-01        1\n",
       "1      A 2020-04-02        1\n",
       "2      A 2020-04-03        1\n",
       "3      A 2020-04-04        0\n",
       "4      A 2020-04-05        1"
      ]
     },
     "execution_count": 245,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# raw data\n",
    "import pandas as pd\n",
    "from datetime import datetime\n",
    "data_df = pd.DataFrame({\n",
    "    'userid': list('AAAAABBBBB'),\n",
    "    'date': ['2020-4-1', '2020-4-2', '2020-4-3',\n",
    "             '2020-4-4', '2020-4-5', '2020-4-16',\n",
    "             '2020-4-17', '2020-4-18', '2020-4-19','2020-4-20'],\n",
    "    'sign_in':[1,1,1,0,1,0,1,1,1,1]\n",
    "})\n",
    "# convert the date column to datetime\n",
    "data_df['date'] = data_df['date'].apply(lambda x : datetime.strptime(x,'%Y-%m-%d'))\n",
    "data_df.head()"
   ]
  },
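  {
   "cell_type": "code",
   "execution_count": 246,
   "id": "d70200d4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Method 1: a helper that walks backwards from the current day and returns\n",
    "# how many rows back the current login streak started (0-based)\n",
    "import numpy as np\n",
    "\n",
    "def fun_last(s):\n",
    "    p=0\n",
    "    # iterate in reverse; working on a numpy array makes it easy to combine with DataFrame agg later\n",
    "    s=np.flipud(s)\n",
    "    for i, x in enumerate(s):\n",
    "        # stop as soon as a day without a login is reached\n",
    "        if x<1:\n",
    "            break\n",
    "        # from the second element on, extend the streak if the previous element was a login\n",
    "        if i>0:\n",
    "            if s[i-1]>0:\n",
    "                p=p+1\n",
    "    return p\n",
    "# test cases\n",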
    "#s=[1,0,1,1,1,1] \n",
    "#s=[0,1,0]\n",
    "#s=[0,1,1]\n",
    "#s=[0,0,1]\n",
    "#print(fun_last(s))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 247,
   "id": "7260af01",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Per-row results:\n",
      "  userid       date  sign_in  last_index  last_date  days\n",
      "0      A 2020-04-01        1         0.0 2020-04-01     1\n",
      "1      A 2020-04-02        1         1.0 2020-04-01     2\n",
      "2      A 2020-04-03        1         2.0 2020-04-01     3\n",
      "3      A 2020-04-04        0         0.0 2020-04-04     0\n",
      "4      A 2020-04-05        1         0.0 2020-04-05     1\n",
      "5      B 2020-04-16        0         0.0 2020-04-16     0\n",
      "6      B 2020-04-17        1         0.0 2020-04-17     1\n",
      "7      B 2020-04-18        1         1.0 2020-04-17     2\n",
      "8      B 2020-04-19        1         2.0 2020-04-17     3\n",
      "9      B 2020-04-20        1         3.0 2020-04-17     4\n",
      "Longest historical login streak (days=1 means a login on that day only):\n",
      "        date  last_date  days\n",
      "2 2020-04-03 2020-04-01     3\n",
      "9 2020-04-20 2020-04-17     4\n",
      "Most recent login streak:\n",
      "        date  last_date  days\n",
      "4 2020-04-05 2020-04-05     1\n",
      "9 2020-04-20 2020-04-17     4\n"
     ]
    }
   ],
   "source": [
    "# sort first\n",
    "data_df.sort_values(['userid','date'],inplace=True)\n",
    "# rebuild the index\n",
    "data_df=data_df.reset_index(drop=True)\n",
    "# expanding window per user: for each row, how many rows back the current streak started\n",
    "data_df['last_index'] = data_df.groupby(['userid'])['sign_in'].transform(lambda x:x.expanding().agg(fun_last))\n",
    "# look up the streak start date from that offset; a list comprehension avoids index alignment issues\n",
    "data_df['last_date']=[x for x in data_df['date'][data_df.index-data_df['last_index']]]\n",
    "# date difference in days, plus sign_in so that a login day counts as 1 and a no-login day as 0\n",
    "data_df['days']=(data_df['date']-data_df['last_date']).dt.days+data_df['sign_in']\n",
    "# per-row results\n",
    "print('Per-row results:')\n",
    "print(data_df)\n",
    "\n",
    "# longest historical login streak\n",
    "data_df['max'] = data_df.groupby('userid')['days'].transform('max')\n",
    "print('Longest historical login streak (days=1 means a login on that day only):')\n",
    "print(data_df.loc[data_df['days']==data_df['max'],['date','last_date','days']])\n",
    "\n",
    "# most recent login streak\n",
    "data_df['max_date'] = data_df.groupby('userid')['date'].transform('max')\n",
    "print('Most recent login streak:')\n",
    "print(data_df.loc[data_df['date']==data_df['max_date'],['date','last_date','days']])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 244,
   "id": "3cabb047",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Per-row results:\n",
      "  userid       date  sign_in  day_interval  day_rank  diff_value  last_date  days\n",
      "0      A 2020-04-01        1          1186       1.0      1185.0 2020-04-01     1\n",
      "1      A 2020-04-02        1          1187       2.0      1185.0 2020-04-01     2\n",
      "2      A 2020-04-03        1          1188       3.0      1185.0 2020-04-01     3\n",
      "3      A 2020-04-05        1          1190       4.0      1186.0 2020-04-05     1\n",
      "4      B 2020-04-17        1          1202       1.0      1201.0 2020-04-17     1\n",
      "5      B 2020-04-18        1          1203       2.0      1201.0 2020-04-17     2\n",
      "6      B 2020-04-19        1          1204       3.0      1201.0 2020-04-17     3\n",
      "7      B 2020-04-20        1          1205       4.0      1201.0 2020-04-17     4\n",
      "Longest historical login streak (days=1 means a login on that day only):\n",
      "        date  last_date  days\n",
      "2 2020-04-03 2020-04-01     3\n",
      "7 2020-04-20 2020-04-17     4\n",
      "最近1次连续登录天数:\n",
      "        date  last_date  days\n",
      "3 2020-04-05 2020-04-05     1\n",
      "7 2020-04-20 2020-04-17     4\n"
     ]
    }
   ],
   "source": [
    "# Method 2: difference against a fixed reference date\n",
    "import pandas as pd\n",
    "from datetime import datetime\n",
    "# widen the display limits for readability\n",
    "pd.set_option('display.max_rows',500)\n",
    "pd.set_option('display.max_columns',500)\n",
    "pd.set_option('display.width',1000)\n",
    "\n",
    "\n",
    "data_df = pd.DataFrame({\n",
    "    'userid': list('AAAAABBBBB'),\n",
    "    'date': ['2020-4-1', '2020-4-2', '2020-4-3',\n",
    "             '2020-4-4', '2020-4-5', '2020-4-16',\n",
    "             '2020-4-17', '2020-4-18', '2020-4-19','2020-4-20'],\n",
    "    'sign_in':[1,1,1,0,1,0,1,1,1,1]\n",
    "})\n",
    "# convert the date column to datetime\n",
    "data_df['date'] = data_df['date'].apply(lambda x : datetime.strptime(x,'%Y-%m-%d'))\n",
    "\n",
    "# keep only the days with a login (copy so the in-place sort below does not act on a view)\n",
    "data_df=data_df.loc[data_df['sign_in']==1,:].copy()\n",
    "# sort\n",
    "data_df.sort_values(['userid','date'],inplace=True)\n",
    "# rebuild the index\n",
    "data_df=data_df.reset_index(drop=True)\n",
    "\n",
    "# days since a fixed reference date minus the per-user rank: this value is constant within a streak\n",
    "data_df['day_interval'] = (data_df['date']-pd.to_datetime('2017-01-01')).dt.days\n",
    "data_df['day_rank']= data_df.groupby(['userid'])['date'].rank(ascending=True,method='first')\n",
    "data_df['diff_value']=data_df['day_interval']-data_df['day_rank']\n",
    "\n",
    "# start date of the streak each row belongs to\n",
    "data_df['last_date']=data_df.groupby(['userid','diff_value'])['date'].transform('min')\n",
    "# streak length up to and including the current day\n",
    "data_df['days']=(data_df['date']-data_df['last_date']).dt.days+data_df['sign_in']\n",
    "# per-row results\n",
    "print('Per-row results:')\n",
    "print(data_df)\n",
    "\n",
    "# longest historical login streak\n",
    "data_df['max'] = data_df.groupby('userid')['days'].transform('max')\n",
    "print('Longest historical login streak (days=1 means a login on that day only):')\n",
    "print(data_df.loc[data_df['days']==data_df['max'],['date','last_date','days']])\n",
    "\n",
    "#最近1次连续登录天数\n",
    "data_df['max_date'] = data_df.groupby('userid')['date'].transform('max')\n",
    "print('最近1次连续登录天数:')\n",
    "print(data_df.loc[data_df['date']==data_df['max_date'],['date','last_date','days']])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6136223c",
   "metadata": {},
   "source": [
    "### 2. SQL method\n",
    "* Use the `sqlite3` package to simulate running the SQL.\n",
    "* [SQLite - Python](https://www.runoob.com/sqlite/sqlite-python.html)\n",
    "* [Guide to the Python sqlite3 database module](https://zhuanlan.zhihu.com/p/196807781)\n",
    "* [SQL interview question: consecutive login days](https://yuguiyang.github.io/2017/08/31/data-analyst-interview-sql-03/)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 205,
   "id": "32a6f001",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Opened database successfully\n",
      "Table created successfully\n"
     ]
    }
   ],
   "source": [
    "# Create the table\n",
    "import sqlite3\n",
    "conn = sqlite3.connect('test.db')\n",
    "print(\"Opened database successfully\")\n",
    "c = conn.cursor()\n",
    "# IF NOT EXISTS lets this cell be re-run without raising an error\n",
    "c.execute('''CREATE TABLE IF NOT EXISTS t_sign\n",
    "       (ID INT PRIMARY KEY     NOT NULL,\n",
    "       userid        varchar(20)    NOT NULL,\n",
    "       date           date      NOT NULL,\n",
    "       sign_in        INT     NOT NULL);''')\n",
    "print(\"Table created successfully\")\n",
    "conn.commit()\n",
    "conn.close()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 209,
   "id": "43c76d0b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Opened database successfully\n",
      "Records created successfully\n"
     ]
    }
   ],
   "source": [
    "# Insert multiple rows\n",
    "conn = sqlite3.connect('test.db')\n",
    "c = conn.cursor()\n",
    "print(\"Opened database successfully\")\n",
    "input_data = [\n",
    "    (1,'A', '2020-04-01',1),\n",
    "    (2,'A', '2020-04-02',1),\n",
    "    (3,'A', '2020-04-03',1),\n",
    "    (4,'A', '2020-04-04',0),\n",
    "    (5,'A', '2020-04-05',1),\n",
    "    (6,'B', '2020-04-16',0),\n",
    "    (7,'B', '2020-04-17',1),\n",
    "    (8,'B', '2020-04-18',1),\n",
    "    (9,'B', '2020-04-19',1),\n",
    "    (10,'B', '2020-04-20',1)\n",
    "]\n",
    "# INSERT OR REPLACE keeps the cell re-runnable: a row with an existing ID is overwritten\n",
    "c.executemany('INSERT OR REPLACE INTO t_sign VALUES (?,?,?,?)', input_data)\n",
    "conn.commit()\n",
    "print(\"Records created successfully\")\n",
    "conn.close()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 241,
   "id": "ae4831f3",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[('A', '2020-04-01', '2020-04-03', 3), ('B', '2020-04-17', '2020-04-20', 4)]\n"
     ]
    }
   ],
   "source": [
    "# Core idea: number each user's login dates with the row_number() window function.\n",
    "# Within a run of consecutive dates, (days since a fixed date) - (row number) stays constant,\n",
    "# so grouping by that difference isolates each streak.\n",
    "# SQLite computes date differences with julianday(); Hive more commonly uses datediff().\n",
    "\n",
    "import sqlite3\n",
    "# Connect to SQLite\n",
    "conn = sqlite3.connect('test.db')\n",
    "# Create a cursor object with cursor()\n",
    "cursor = conn.cursor()\n",
    "# Run the query\n",
    "cursor.execute('''\n",
    "with tmp as (\n",
    "select\n",
    "\tuser_id,\n",
    "\tdiff_value, -- streak label\n",
    "\tmin(login_date) start_date, -- first day of the streak\n",
    "\tmax(login_date) end_date, -- last day of the streak\n",
    "\tcount(1) running_days -- number of consecutive login days\n",
    "from (\n",
    "\tselect\n",
    "\t\tuserid as user_id,\n",
    "\t\tdate as login_date,\n",
    "\t\tjulianday(date) - julianday('2017-01-01') day_interval, -- days since the fixed date\n",
    "\t\trow_number() over(partition by userid order by date) day_rank, -- rank of the date within the user\n",
    "\t\t(julianday(date) - julianday('2017-01-01')-\n",
    "\t\trow_number() over(partition by userid order by date)\n",
    "\t\t) diff_value\t-- days since the fixed date minus the rank\n",
    "\tfrom\n",
    "\t\tt_sign\n",
    "\twhere sign_in=1\n",
    ") base\n",
    "group by user_id,diff_value\n",
    ")\n",
    "select\n",
    "\ta.user_id,a.start_date,a.end_date,a.running_days\n",
    "from\n",
    "\ttmp a\n",
    "join (\n",
    "\tselect user_id,max(running_days) running_days from tmp group by user_id\n",
    ") b on a.user_id = b.user_id\n",
    "and a.running_days = b.running_days\n",
    "''')\n",
    "\n",
    "result = cursor.fetchall()\n",
    "print(result)\n",
    "# Close the connection (the query is read-only, so there is nothing to commit)\n",
    "conn.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7df8503a",
   "metadata": {},
   "source": [
    "## File handling\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 207,
   "id": "37be0289",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "   NumRooms Alley   Price\n",
      "0       NaN  Pave  127500\n",
      "1       2.0   NaN  106000\n",
      "2       4.0   NaN  178100\n",
      "3       NaN   NaN  140000\n"
     ]
    }
   ],
   "source": [
    "# Read and write files\n",
    "import os\n",
    "\n",
    "os.makedirs(os.path.join('..', 'data'), exist_ok=True)\n",
    "data_file = os.path.join('..', 'data', 'house_tiny.csv')\n",
    "with open(data_file, 'w') as f:\n",
    "    f.write('NumRooms,Alley,Price\\n')  # Column names\n",
    "    f.write('NA,Pave,127500\\n')  # Each row is one data sample\n",
    "    f.write('2,NA,106000\\n')\n",
    "    f.write('4,NA,178100\\n')\n",
    "    f.write('NA,NA,140000\\n')\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "data = pd.read_csv(data_file)\n",
    "print(data)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
